Overview

Dataset statistics

Number of variables: 100
Number of observations: 601451
Missing cells: 17024416
Missing cells (%): 28.3%
Total size in memory: 458.9 MiB
Average record size in memory: 800.0 B
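
The figures in this table can be cross-checked against each other. The sketch below is only an arithmetic check on the numbers reported above (the variable names are illustrative, not part of the report): the missing-cell percentage is the missing cells divided by variables times observations, and the total memory size is the average record size times the number of observations.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 100
n_observations = 601_451
missing_cells = 17_024_416
avg_record_bytes = 800.0

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Total size in MiB implied by the average record size.
total_mib = avg_record_bytes * n_observations / 1024**2

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"total size in memory: {total_mib:.1f} MiB")
```

Both derived values agree with the table after rounding to one decimal place.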

Variable types

Text: 100

Dataset

Description: Mammal NMNH Extant Specimen Records 0054884-241126133413365
URL: https://doi.org/10.15468/dl.dys66y

Alerts

license has constant value "CC0_1_0" Constant
publisher has constant value "National Museum of Natural History, Smithsonian Institution" Constant
collectionID has constant value "urn:uuid:59e56a59-8615-4e0c-841d-eb88f3876b22" Constant
collectionCode has constant value "MAMM" Constant
datasetName has constant value "NMNH Extant Biology" Constant
occurrenceStatus has constant value "PRESENT" Constant
kingdom has constant value "Animalia" Constant
datasetKey has constant value "821cc27a-e3bb-4bc5-ac34-89ada245069d" Constant
publishingCountry has constant value "US" Constant
kingdomKey has constant value "1" Constant
protocol has constant value "EML" Constant
lastCrawled has constant value "2024-12-02T11:48:23.416Z" Constant
publishedByGbifRegion has constant value "NORTH_AMERICA" Constant
recordNumber has 50821 (8.4%) missing values Missing
recordedBy has 55563 (9.2%) missing values Missing
sex has 88216 (14.7%) missing values Missing
lifeStage has 550088 (91.5%) missing values Missing
preparations has 26965 (4.5%) missing values Missing
associatedSequences has 600397 (99.8%) missing values Missing
occurrenceRemarks has 590662 (98.2%) missing values Missing
eventDate has 28480 (4.7%) missing values Missing
startDayOfYear has 67487 (11.2%) missing values Missing
endDayOfYear has 67487 (11.2%) missing values Missing
year has 28519 (4.7%) missing values Missing
month has 45368 (7.5%) missing values Missing
day has 68254 (11.3%) missing values Missing
verbatimEventDate has 36490 (6.1%) missing values Missing
habitat has 468915 (78.0%) missing values Missing
continent has 39181 (6.5%) missing values Missing
waterBody has 539858 (89.8%) missing values Missing
islandGroup has 596682 (99.2%) missing values Missing
island has 564842 (93.9%) missing values Missing
stateProvince has 93954 (15.6%) missing values Missing
county has 447402 (74.4%) missing values Missing
locality has 35404 (5.9%) missing values Missing
verbatimElevation has 599861 (99.7%) missing values Missing
decimalLatitude has 447917 (74.5%) missing values Missing
decimalLongitude has 447917 (74.5%) missing values Missing
verbatimCoordinateSystem has 468202 (77.8%) missing values Missing
georeferenceProtocol has 592196 (98.5%) missing values Missing
georeferenceRemarks has 601383 (> 99.9%) missing values Missing
identificationQualifier has 599947 (99.7%) missing values Missing
typeStatus has 597715 (99.4%) missing values Missing
identifiedBy has 593267 (98.6%) missing values Missing
specificEpithet has 29657 (4.9%) missing values Missing
infraspecificEpithet has 386527 (64.3%) missing values Missing
elevation has 496901 (82.6%) missing values Missing
elevationAccuracy has 597572 (99.4%) missing values Missing
depth has 601448 (> 99.9%) missing values Missing
distanceFromCentroidInMeters has 601180 (> 99.9%) missing values Missing
mediaType has 45831 (7.6%) missing values Missing
speciesKey has 29663 (4.9%) missing values Missing
species has 29663 (4.9%) missing values Missing
gbifRegion has 15955 (2.7%) missing values Missing
level0Gid has 473902 (78.8%) missing values Missing
level0Name has 473902 (78.8%) missing values Missing
level1Gid has 473930 (78.8%) missing values Missing
level1Name has 473930 (78.8%) missing values Missing
level2Gid has 475037 (79.0%) missing values Missing
level2Name has 475037 (79.0%) missing values Missing
level3Gid has 539154 (89.6%) missing values Missing
level3Name has 539390 (89.7%) missing values Missing
iucnRedListCategory has 210302 (35.0%) missing values Missing
gbifID has unique values Unique
occurrenceID has unique values Unique
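
The three alert kinds listed above (Constant, Missing, Unique) can be illustrated with a small sketch. This is not ydata-profiling's implementation; the function name `column_alerts`, the use of `None` for a missing cell, and the exact rules are assumptions chosen to mirror the wording of the alerts.

```python
def column_alerts(name, values):
    """Return alert strings for one column; None marks a missing cell."""
    n = len(values)
    present = [v for v in values if v is not None]
    n_missing = n - len(present)
    distinct = set(present)
    alerts = []
    if n_missing > 0:
        pct = 100 * n_missing / n
        alerts.append(f"{name} has {n_missing} ({pct:.1f}%) missing values")
    # Constant: no gaps and a single distinct value.
    if n_missing == 0 and len(distinct) == 1:
        alerts.append(f'{name} has constant value "{present[0]}"')
    # Unique: no gaps and every value occurs exactly once.
    if n_missing == 0 and n > 1 and len(distinct) == n:
        alerts.append(f"{name} has unique values")
    return alerts


# Toy columns, not taken from the dataset.
print(column_alerts("kingdom", ["Animalia"] * 4))
print(column_alerts("sex", ["MALE", None, "FEMALE", None]))
print(column_alerts("gbifID", ["1", "2", "3"]))
```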

Reproduction

Analysis started: 2025-01-08 22:53:29.700947
Analysis finished: 2025-01-08 22:53:52.845809
Duration: 23.14 seconds
Software version: ydata-profiling v4.12.1
Download configuration: config.json
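
The reported duration follows from the two timestamps. A minimal check using only the standard library (the format string is an assumption about how the timestamps are written):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2025-01-08 22:53:29.700947", FMT)
finished = datetime.strptime("2025-01-08 22:53:52.845809", FMT)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```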

Variables

gbifID
Text

Unique 

Distinct: 601451
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 6014510
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 601451
Unique (%): 100.0%

Sample

1st row: 1322535732
2nd row: 1322538146
3rd row: 1317206206
4th row: 1317210025
5th row: 1317210456
ValueCountFrequency (%)
1322535732 1
 
< 0.1%
1322555094 1
 
< 0.1%
1322560018 1
 
< 0.1%
1322558352 1
 
< 0.1%
1317224532 1
 
< 0.1%
4041103536 1
 
< 0.1%
1317206206 1
 
< 0.1%
1317210025 1
 
< 0.1%
1317210456 1
 
< 0.1%
1317211504 1
 
< 0.1%
Other values (601441) 601441
> 99.9%
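
The layout of the value table above (the ten most common values followed by a single "Other values" row) can be sketched with `collections.Counter`. The function name `frequency_table` and the `top_n` parameter are illustrative; this is not the report generator's own code.

```python
from collections import Counter


def frequency_table(values, top_n=10):
    """Return (value, count, percent) rows, folding the tail into one row."""
    counts = Counter(values)
    total = sum(counts.values())
    top = counts.most_common(top_n)
    rows = [(value, count, 100 * count / total) for value, count in top]
    n_other = len(counts) - len(top)
    if n_other > 0:
        other_count = total - sum(count for _, count in top)
        rows.append((f"Other values ({n_other})", other_count,
                     100 * other_count / total))
    return rows


# Toy data: 25 distinct identifiers, each occurring once.
ids = [str(i) for i in range(25)]
table = frequency_table(ids)
for value, count, pct in table:
    print(value, count, f"{pct:.1f}%")
```

With a fully unique column such as gbifID, every listed value has a count of 1 and the final row absorbs all remaining distinct values.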

Most occurring characters

ValueCountFrequency (%)
1 1342473
22.3%
3 953825
15.9%
2 772027
12.8%
8 469400
 
7.8%
9 463026
 
7.7%
0 459240
 
7.6%
7 444579
 
7.4%
4 377786
 
6.3%
5 367488
 
6.1%
6 364666
 
6.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6014510
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1342473
22.3%
3 953825
15.9%
2 772027
12.8%
8 469400
 
7.8%
9 463026
 
7.7%
0 459240
 
7.6%
7 444579
 
7.4%
4 377786
 
6.3%
5 367488
 
6.1%
6 364666
 
6.1%

Most occurring scripts

ValueCountFrequency (%)
Common 6014510
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1342473
22.3%
3 953825
15.9%
2 772027
12.8%
8 469400
 
7.8%
9 463026
 
7.7%
0 459240
 
7.6%
7 444579
 
7.4%
4 377786
 
6.3%
5 367488
 
6.1%
6 364666
 
6.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6014510
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1342473
22.3%
3 953825
15.9%
2 772027
12.8%
8 469400
 
7.8%
9 463026
 
7.7%
0 459240
 
7.6%
7 444579
 
7.4%
4 377786
 
6.3%
5 367488
 
6.1%
6 364666
 
6.1%

license
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 4210157
Distinct characters: 4
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CC0_1_0
2nd row: CC0_1_0
3rd row: CC0_1_0
4th row: CC0_1_0
5th row: CC0_1_0
ValueCountFrequency (%)
cc0_1_0 601451
100.0%

Most occurring characters

ValueCountFrequency (%)
C 1202902
28.6%
0 1202902
28.6%
_ 1202902
28.6%
1 601451
14.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1804353
42.9%
Uppercase Letter 1202902
28.6%
Connector Punctuation 1202902
28.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1202902
66.7%
1 601451
33.3%
Uppercase Letter
ValueCountFrequency (%)
C 1202902
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1202902
100.0%
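
The category counts for this constant column can be reproduced from the single value "CC0_1_0" with Python's standard `unicodedata` module. This is a sketch under the assumption that every one of the 601451 rows holds exactly that string; it covers general categories only, since the standard library does not expose the script or block properties.

```python
import unicodedata
from collections import Counter

value = "CC0_1_0"
n_rows = 601_451

# Two-letter general category codes:
# 'Lu' = Uppercase Letter, 'Nd' = Decimal Number, 'Pc' = Connector Punctuation
per_value = Counter(unicodedata.category(ch) for ch in value)

# Scale the per-value counts up to the whole column.
totals = {cat: n * n_rows for cat, n in per_value.items()}
print(totals)
```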

Most occurring scripts

ValueCountFrequency (%)
Common 3007255
71.4%
Latin 1202902
 
28.6%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1202902
40.0%
_ 1202902
40.0%
1 601451
20.0%
Latin
ValueCountFrequency (%)
C 1202902
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4210157
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 1202902
28.6%
0 1202902
28.6%
_ 1202902
28.6%
1 601451
14.3%
Distinct: 29672
Distinct (%): 4.9%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 20
Median length: 20
Mean length: 20
Min length: 20

Characters and Unicode

Total characters: 12029020
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 12662
Unique (%): 2.1%

Sample

1st row: 2021-08-09T14:50:00Z
2nd row: 2020-04-09T11:54:00Z
3rd row: 2020-03-17T10:16:00Z
4th row: 2020-05-20T10:50:00Z
5th row: 2017-12-08T15:28:00Z
ValueCountFrequency (%)
2021-01-11t15:15:00z 2641
 
0.4%
2023-02-10t10:31:00z 2632
 
0.4%
2021-08-09t14:46:00z 2522
 
0.4%
2020-07-20t15:30:00z 2313
 
0.4%
2017-12-08t15:27:00z 2105
 
0.3%
2021-08-09t14:49:00z 2096
 
0.3%
2017-12-08t15:33:00z 2050
 
0.3%
2017-12-08t15:36:00z 2008
 
0.3%
2020-07-24t16:11:00z 1979
 
0.3%
2017-12-08t15:35:00z 1972
 
0.3%
Other values (29662) 579133
96.3%

Most occurring characters

ValueCountFrequency (%)
0 3217264
26.7%
2 1683114
14.0%
1 1412891
11.7%
- 1202902
 
10.0%
: 1202902
 
10.0%
T 601451
 
5.0%
Z 601451
 
5.0%
4 455860
 
3.8%
3 439973
 
3.7%
5 428273
 
3.6%
Other values (4) 782939
 
6.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 8420314
70.0%
Dash Punctuation 1202902
 
10.0%
Other Punctuation 1202902
 
10.0%
Uppercase Letter 1202902
 
10.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 3217264
38.2%
2 1683114
20.0%
1 1412891
16.8%
4 455860
 
5.4%
3 439973
 
5.2%
5 428273
 
5.1%
9 215795
 
2.6%
6 207959
 
2.5%
7 187569
 
2.2%
8 171616
 
2.0%
Uppercase Letter
ValueCountFrequency (%)
T 601451
50.0%
Z 601451
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 1202902
100.0%
Other Punctuation
ValueCountFrequency (%)
: 1202902
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 10826118
90.0%
Latin 1202902
 
10.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 3217264
29.7%
2 1683114
15.5%
1 1412891
13.1%
- 1202902
 
11.1%
: 1202902
 
11.1%
4 455860
 
4.2%
3 439973
 
4.1%
5 428273
 
4.0%
9 215795
 
2.0%
6 207959
 
1.9%
Other values (2) 359185
 
3.3%
Latin
ValueCountFrequency (%)
T 601451
50.0%
Z 601451
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12029020
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 3217264
26.7%
2 1683114
14.0%
1 1412891
11.7%
- 1202902
 
10.0%
: 1202902
 
10.0%
T 601451
 
5.0%
Z 601451
 
5.0%
4 455860
 
3.8%
3 439973
 
3.7%
5 428273
 
3.6%
Other values (4) 782939
 
6.5%

publisher
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 59
Median length: 59
Mean length: 59
Min length: 59

Characters and Unicode

Total characters: 35485609
Distinct characters: 21
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: National Museum of Natural History, Smithsonian Institution
2nd row: National Museum of Natural History, Smithsonian Institution
3rd row: National Museum of Natural History, Smithsonian Institution
4th row: National Museum of Natural History, Smithsonian Institution
5th row: National Museum of Natural History, Smithsonian Institution
ValueCountFrequency (%)
national 601451
14.3%
museum 601451
14.3%
of 601451
14.3%
natural 601451
14.3%
history 601451
14.3%
smithsonian 601451
14.3%
institution 601451
14.3%

Most occurring characters

ValueCountFrequency (%)
t 4210157
11.9%
i 3608706
10.2%
3608706
10.2%
a 3007255
 
8.5%
o 3007255
 
8.5%
n 3007255
 
8.5%
s 2405804
 
6.8%
u 2405804
 
6.8%
r 1202902
 
3.4%
m 1202902
 
3.4%
Other values (11) 7818863
22.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 27666746
78.0%
Space Separator 3608706
 
10.2%
Uppercase Letter 3608706
 
10.2%
Other Punctuation 601451
 
1.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 4210157
15.2%
i 3608706
13.0%
a 3007255
10.9%
o 3007255
10.9%
n 3007255
10.9%
s 2405804
8.7%
u 2405804
8.7%
r 1202902
 
4.3%
m 1202902
 
4.3%
l 1202902
 
4.3%
Other values (4) 2405804
8.7%
Uppercase Letter
ValueCountFrequency (%)
N 1202902
33.3%
M 601451
16.7%
H 601451
16.7%
S 601451
16.7%
I 601451
16.7%
Space Separator
ValueCountFrequency (%)
3608706
100.0%
Other Punctuation
ValueCountFrequency (%)
, 601451
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 31275452
88.1%
Common 4210157
 
11.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 4210157
13.5%
i 3608706
11.5%
a 3007255
9.6%
o 3007255
9.6%
n 3007255
9.6%
s 2405804
 
7.7%
u 2405804
 
7.7%
r 1202902
 
3.8%
m 1202902
 
3.8%
N 1202902
 
3.8%
Other values (9) 6014510
19.2%
Common
ValueCountFrequency (%)
3608706
85.7%
, 601451
 
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 35485609
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 4210157
11.9%
i 3608706
10.2%
3608706
10.2%
a 3007255
 
8.5%
o 3007255
 
8.5%
n 3007255
 
8.5%
s 2405804
 
6.8%
u 2405804
 
6.8%
r 1202902
 
3.4%
m 1202902
 
3.4%
Other values (11) 7818863
22.0%
Distinct: 50
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 29
Median length: 29
Mean length: 28.8108624
Min length: 2

Characters and Unicode

Total characters: 17328322
Distinct characters: 41
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 13
Unique (%): < 0.1%

Sample

1st row: urn:lsid:biocol.org:col:34871
2nd row: urn:lsid:biocol.org:col:34871
3rd row: urn:lsid:biocol.org:col:34871
4th row: urn:lsid:biocol.org:col:34871
5th row: urn:lsid:biocol.org:col:34871
ValueCountFrequency (%)
urn:lsid:biocol.org:col:34871 596967
99.3%
nsmt 977
 
0.2%
uam 775
 
0.1%
nrm 386
 
0.1%
rmnh 354
 
0.1%
rcs 246
 
< 0.1%
nmv 238
 
< 0.1%
nmsz 188
 
< 0.1%
zmmu 179
 
< 0.1%
fcmm 127
 
< 0.1%
Other values (40) 1015
 
0.2%

Most occurring characters

ValueCountFrequency (%)
o 2387868
13.8%
: 2387868
13.8%
l 1790901
 
10.3%
i 1193934
 
6.9%
r 1193934
 
6.9%
c 1193934
 
6.9%
g 596967
 
3.4%
7 596967
 
3.4%
8 596967
 
3.4%
4 596967
 
3.4%
Other values (31) 4792015
27.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11342373
65.5%
Other Punctuation 2984837
 
17.2%
Decimal Number 2984835
 
17.2%
Uppercase Letter 16276
 
0.1%
Space Separator 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 4384
26.9%
N 2583
15.9%
S 1796
11.0%
A 1319
 
8.1%
U 1175
 
7.2%
R 1035
 
6.4%
T 978
 
6.0%
C 551
 
3.4%
H 550
 
3.4%
Z 467
 
2.9%
Other values (11) 1438
 
8.8%
Lowercase Letter
ValueCountFrequency (%)
o 2387868
21.1%
l 1790901
15.8%
i 1193934
10.5%
r 1193934
10.5%
c 1193934
10.5%
g 596967
 
5.3%
u 596967
 
5.3%
b 596967
 
5.3%
d 596967
 
5.3%
s 596967
 
5.3%
Decimal Number
ValueCountFrequency (%)
7 596967
20.0%
8 596967
20.0%
4 596967
20.0%
3 596967
20.0%
1 596967
20.0%
Other Punctuation
ValueCountFrequency (%)
: 2387868
80.0%
. 596967
 
20.0%
? 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11358649
65.5%
Common 5969673
34.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 2387868
21.0%
l 1790901
15.8%
i 1193934
10.5%
r 1193934
10.5%
c 1193934
10.5%
g 596967
 
5.3%
u 596967
 
5.3%
b 596967
 
5.3%
d 596967
 
5.3%
s 596967
 
5.3%
Other values (22) 613243
 
5.4%
Common
ValueCountFrequency (%)
: 2387868
40.0%
7 596967
 
10.0%
8 596967
 
10.0%
4 596967
 
10.0%
3 596967
 
10.0%
. 596967
 
10.0%
1 596967
 
10.0%
? 2
 
< 0.1%
1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17328322
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 2387868
13.8%
: 2387868
13.8%
l 1790901
 
10.3%
i 1193934
 
6.9%
r 1193934
 
6.9%
c 1193934
 
6.9%
g 596967
 
3.4%
7 596967
 
3.4%
8 596967
 
3.4%
4 596967
 
3.4%
Other values (31) 4792015
27.7%

collectionID
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 45
Median length: 45
Mean length: 45
Min length: 45

Characters and Unicode

Total characters: 27065295
Distinct characters: 22
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: urn:uuid:59e56a59-8615-4e0c-841d-eb88f3876b22
2nd row: urn:uuid:59e56a59-8615-4e0c-841d-eb88f3876b22
3rd row: urn:uuid:59e56a59-8615-4e0c-841d-eb88f3876b22
4th row: urn:uuid:59e56a59-8615-4e0c-841d-eb88f3876b22
5th row: urn:uuid:59e56a59-8615-4e0c-841d-eb88f3876b22
ValueCountFrequency (%)
urn:uuid:59e56a59-8615-4e0c-841d-eb88f3876b22 601451
100.0%

Most occurring characters

ValueCountFrequency (%)
8 3007255
 
11.1%
- 2405804
 
8.9%
5 2405804
 
8.9%
6 1804353
 
6.7%
e 1804353
 
6.7%
u 1804353
 
6.7%
d 1202902
 
4.4%
9 1202902
 
4.4%
: 1202902
 
4.4%
1 1202902
 
4.4%
Other values (12) 9021765
33.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 13833373
51.1%
Lowercase Letter 9623216
35.6%
Dash Punctuation 2405804
 
8.9%
Other Punctuation 1202902
 
4.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
8 3007255
21.7%
5 2405804
17.4%
6 1804353
13.0%
9 1202902
 
8.7%
1 1202902
 
8.7%
4 1202902
 
8.7%
2 1202902
 
8.7%
0 601451
 
4.3%
3 601451
 
4.3%
7 601451
 
4.3%
Lowercase Letter
ValueCountFrequency (%)
e 1804353
18.8%
u 1804353
18.8%
d 1202902
12.5%
b 1202902
12.5%
i 601451
 
6.2%
a 601451
 
6.2%
r 601451
 
6.2%
n 601451
 
6.2%
c 601451
 
6.2%
f 601451
 
6.2%
Dash Punctuation
ValueCountFrequency (%)
- 2405804
100.0%
Other Punctuation
ValueCountFrequency (%)
: 1202902
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 17442079
64.4%
Latin 9623216
35.6%

Most frequent character per script

Common
ValueCountFrequency (%)
8 3007255
17.2%
- 2405804
13.8%
5 2405804
13.8%
6 1804353
10.3%
9 1202902
 
6.9%
: 1202902
 
6.9%
1 1202902
 
6.9%
4 1202902
 
6.9%
2 1202902
 
6.9%
0 601451
 
3.4%
Other values (2) 1202902
 
6.9%
Latin
ValueCountFrequency (%)
e 1804353
18.8%
u 1804353
18.8%
d 1202902
12.5%
b 1202902
12.5%
i 601451
 
6.2%
a 601451
 
6.2%
r 601451
 
6.2%
n 601451
 
6.2%
c 601451
 
6.2%
f 601451
 
6.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 27065295
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8 3007255
 
11.1%
- 2405804
 
8.9%
5 2405804
 
8.9%
6 1804353
 
6.7%
e 1804353
 
6.7%
u 1804353
 
6.7%
d 1202902
 
4.4%
9 1202902
 
4.4%
: 1202902
 
4.4%
1 1202902
 
4.4%
Other values (12) 9021765
33.3%
Distinct: 50
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 7
Median length: 4
Mean length: 3.997244996
Min length: 2

Characters and Unicode

Total characters: 2404147
Distinct characters: 23
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 13
Unique (%): < 0.1%

Sample

1st row: USNM
2nd row: USNM
3rd row: USNM
4th row: USNM
5th row: USNM
ValueCountFrequency (%)
usnm 596967
99.3%
nsmt 977
 
0.2%
uam 775
 
0.1%
nrm 386
 
0.1%
rmnh 354
 
0.1%
rcs 246
 
< 0.1%
nmv 238
 
< 0.1%
nmsz 188
 
< 0.1%
zmmu 179
 
< 0.1%
fcmm 127
 
< 0.1%
Other values (40) 1015
 
0.2%

Most occurring characters

ValueCountFrequency (%)
M 601351
25.0%
N 599550
24.9%
S 598763
24.9%
U 598142
24.9%
A 1319
 
0.1%
R 1035
 
< 0.1%
T 978
 
< 0.1%
C 551
 
< 0.1%
H 550
 
< 0.1%
Z 467
 
< 0.1%
Other values (13) 1441
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2404144
> 99.9%
Other Punctuation 2
 
< 0.1%
Space Separator 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 601351
25.0%
N 599550
24.9%
S 598763
24.9%
U 598142
24.9%
A 1319
 
0.1%
R 1035
 
< 0.1%
T 978
 
< 0.1%
C 551
 
< 0.1%
H 550
 
< 0.1%
Z 467
 
< 0.1%
Other values (11) 1438
 
0.1%
Other Punctuation
ValueCountFrequency (%)
? 2
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2404144
> 99.9%
Common 3
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 601351
25.0%
N 599550
24.9%
S 598763
24.9%
U 598142
24.9%
A 1319
 
0.1%
R 1035
 
< 0.1%
T 978
 
< 0.1%
C 551
 
< 0.1%
H 550
 
< 0.1%
Z 467
 
< 0.1%
Other values (11) 1438
 
0.1%
Common
ValueCountFrequency (%)
? 2
66.7%
1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2404147
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 601351
25.0%
N 599550
24.9%
S 598763
24.9%
U 598142
24.9%
A 1319
 
0.1%
R 1035
 
< 0.1%
T 978
 
< 0.1%
C 551
 
< 0.1%
H 550
 
< 0.1%
Z 467
 
< 0.1%
Other values (13) 1441
 
0.1%

collectionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 2405804
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: MAMM
2nd row: MAMM
3rd row: MAMM
4th row: MAMM
5th row: MAMM
ValueCountFrequency (%)
mamm 601451
100.0%

Most occurring characters

ValueCountFrequency (%)
M 1804353
75.0%
A 601451
 
25.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2405804
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 1804353
75.0%
A 601451
 
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2405804
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 1804353
75.0%
A 601451
 
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2405804
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 1804353
75.0%
A 601451
 
25.0%

datasetName
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 11427569
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NMNH Extant Biology
2nd row: NMNH Extant Biology
3rd row: NMNH Extant Biology
4th row: NMNH Extant Biology
5th row: NMNH Extant Biology
ValueCountFrequency (%)
nmnh 601451
33.3%
extant 601451
33.3%
biology 601451
33.3%

Most occurring characters

ValueCountFrequency (%)
N 1202902
 
10.5%
1202902
 
10.5%
t 1202902
 
10.5%
o 1202902
 
10.5%
M 601451
 
5.3%
H 601451
 
5.3%
E 601451
 
5.3%
x 601451
 
5.3%
a 601451
 
5.3%
n 601451
 
5.3%
Other values (5) 3007255
26.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6615961
57.9%
Uppercase Letter 3608706
31.6%
Space Separator 1202902
 
10.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 1202902
18.2%
o 1202902
18.2%
x 601451
9.1%
a 601451
9.1%
n 601451
9.1%
i 601451
9.1%
l 601451
9.1%
g 601451
9.1%
y 601451
9.1%
Uppercase Letter
ValueCountFrequency (%)
N 1202902
33.3%
M 601451
16.7%
H 601451
16.7%
E 601451
16.7%
B 601451
16.7%
Space Separator
ValueCountFrequency (%)
1202902
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10224667
89.5%
Common 1202902
 
10.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 1202902
11.8%
t 1202902
11.8%
o 1202902
11.8%
M 601451
 
5.9%
H 601451
 
5.9%
E 601451
 
5.9%
x 601451
 
5.9%
a 601451
 
5.9%
n 601451
 
5.9%
B 601451
 
5.9%
Other values (4) 2405804
23.5%
Common
ValueCountFrequency (%)
1202902
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11427569
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 1202902
 
10.5%
1202902
 
10.5%
t 1202902
 
10.5%
o 1202902
 
10.5%
M 601451
 
5.3%
H 601451
 
5.3%
E 601451
 
5.3%
x 601451
 
5.3%
a 601451
 
5.3%
n 601451
 
5.3%
Other values (5) 3007255
26.3%
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 18
Median length: 18
Mean length: 17.95205428
Min length: 17

Characters and Unicode

Total characters: 10797281
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PRESERVED_SPECIMEN
2nd row: PRESERVED_SPECIMEN
3rd row: PRESERVED_SPECIMEN
4th row: PRESERVED_SPECIMEN
5th row: HUMAN_OBSERVATION
ValueCountFrequency (%)
preserved_specimen 572614
95.2%
human_observation 28837
 
4.8%

Most occurring characters

ValueCountFrequency (%)
E 2891907
26.8%
R 1174065
10.9%
S 1174065
10.9%
P 1145228
 
10.6%
N 630288
 
5.8%
M 601451
 
5.6%
I 601451
 
5.6%
_ 601451
 
5.6%
V 601451
 
5.6%
C 572614
 
5.3%
Other values (7) 803310
 
7.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 10195830
94.4%
Connector Punctuation 601451
 
5.6%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 2891907
28.4%
R 1174065
11.5%
S 1174065
11.5%
P 1145228
 
11.2%
N 630288
 
6.2%
M 601451
 
5.9%
I 601451
 
5.9%
V 601451
 
5.9%
C 572614
 
5.6%
D 572614
 
5.6%
Other values (6) 230696
 
2.3%
Connector Punctuation
ValueCountFrequency (%)
_ 601451
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10195830
94.4%
Common 601451
 
5.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 2891907
28.4%
R 1174065
11.5%
S 1174065
11.5%
P 1145228
 
11.2%
N 630288
 
6.2%
M 601451
 
5.9%
I 601451
 
5.9%
V 601451
 
5.9%
C 572614
 
5.6%
D 572614
 
5.6%
Other values (6) 230696
 
2.3%
Common
ValueCountFrequency (%)
_ 601451
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10797281
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 2891907
26.8%
R 1174065
10.9%
S 1174065
10.9%
P 1145228
 
10.6%
N 630288
 
5.8%
M 601451
 
5.6%
I 601451
 
5.6%
_ 601451
 
5.6%
V 601451
 
5.6%
C 572614
 
5.3%
Other values (7) 803310
 
7.4%

occurrenceID
Text

Unique 

Distinct: 601451
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 63
Median length: 63
Mean length: 63
Min length: 63

Characters and Unicode

Total characters: 37891413
Distinct characters: 26
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 601451
Unique (%): 100.0%

Sample

1st row: http://n2t.net/ark:/65665/3ebec6a7f-5e95-4543-b061-6d73d80dd2ee
2nd row: http://n2t.net/ark:/65665/3ec070d5d-1893-4600-afa5-e56695ff219b
3rd row: http://n2t.net/ark:/65665/3002acaf9-9788-4539-8883-fe6bfd5f8d88
4th row: http://n2t.net/ark:/65665/300553499-1544-460e-9507-55ada241f992
5th row: http://n2t.net/ark:/65665/3005a3503-9c20-443c-899a-559e550dc71e
ValueCountFrequency (%)
http://n2t.net/ark:/65665/3ebec6a7f-5e95-4543-b061-6d73d80dd2ee 1
 
< 0.1%
http://n2t.net/ark:/65665/3ecc76d35-e5c5-434e-874b-88c5d85dbb91 1
 
< 0.1%
http://n2t.net/ark:/65665/3ecff6276-27d1-4ad7-aac3-32c485b9bed6 1
 
< 0.1%
http://n2t.net/ark:/65665/3eceb4d85-2fbe-4bf2-aef7-b3393445f319 1
 
< 0.1%
http://n2t.net/ark:/65665/300f96572-4f6d-48dc-9b78-1ba0e03bb0ae 1
 
< 0.1%
http://n2t.net/ark:/65665/3ec5d68e1-4786-40d2-9bdb-bb8ef2ad056d 1
 
< 0.1%
http://n2t.net/ark:/65665/3002acaf9-9788-4539-8883-fe6bfd5f8d88 1
 
< 0.1%
http://n2t.net/ark:/65665/300553499-1544-460e-9507-55ada241f992 1
 
< 0.1%
http://n2t.net/ark:/65665/3005a3503-9c20-443c-899a-559e550dc71e 1
 
< 0.1%
http://n2t.net/ark:/65665/300664e6c-5334-4a8e-b9a7-4d84389595e0 1
 
< 0.1%
Other values (601441) 601441
> 99.9%

Most occurring characters

ValueCountFrequency (%)
/ 3007255
 
7.9%
6 2930823
 
7.7%
- 2405804
 
6.3%
t 2405804
 
6.3%
5 2330760
 
6.2%
a 1878835
 
5.0%
e 1729856
 
4.6%
2 1729289
 
4.6%
3 1728046
 
4.6%
4 1727823
 
4.6%
Other values (16) 16017118
42.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 16387822
43.2%
Lowercase Letter 14286179
37.7%
Other Punctuation 4811608
 
12.7%
Dash Punctuation 2405804
 
6.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 2405804
16.8%
a 1878835
13.2%
e 1729856
12.1%
b 1278851
9.0%
n 1202902
8.4%
f 1128774
7.9%
c 1128212
7.9%
d 1127141
7.9%
k 601451
 
4.2%
r 601451
 
4.2%
Other values (2) 1202902
8.4%
Decimal Number
ValueCountFrequency (%)
6 2930823
17.9%
5 2330760
14.2%
2 1729289
10.6%
3 1728046
10.5%
4 1727823
10.5%
9 1279292
7.8%
8 1278534
7.8%
0 1129193
 
6.9%
7 1127612
 
6.9%
1 1126450
 
6.9%
Other Punctuation
ValueCountFrequency (%)
/ 3007255
62.5%
: 1202902
 
25.0%
. 601451
 
12.5%
Dash Punctuation
ValueCountFrequency (%)
- 2405804
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 23605234
62.3%
Latin 14286179
37.7%

Most frequent character per script

Common
ValueCountFrequency (%)
/ 3007255
12.7%
6 2930823
12.4%
- 2405804
10.2%
5 2330760
9.9%
2 1729289
7.3%
3 1728046
7.3%
4 1727823
7.3%
9 1279292
 
5.4%
8 1278534
 
5.4%
: 1202902
 
5.1%
Other values (4) 3984706
16.9%
Latin
ValueCountFrequency (%)
t 2405804
16.8%
a 1878835
13.2%
e 1729856
12.1%
b 1278851
9.0%
n 1202902
8.4%
f 1128774
7.9%
c 1128212
7.9%
d 1127141
7.9%
k 601451
 
4.2%
r 601451
 
4.2%
Other values (2) 1202902
8.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 37891413
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 3007255
 
7.9%
6 2930823
 
7.7%
- 2405804
 
6.3%
t 2405804
 
6.3%
5 2330760
 
6.2%
a 1878835
 
5.0%
e 1729856
 
4.6%
2 1729289
 
4.6%
3 1728046
 
4.6%
4 1727823
 
4.6%
Other values (16) 16017118
42.3%
Distinct: 601428
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 16
Median length: 11
Mean length: 10.92069179
Min length: 4

Characters and Unicode

Total characters: 6568261
Distinct characters: 35
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 601407
Unique (%): > 99.9%

Sample

1st row: USNM 449558
2nd row: USNM 226903
3rd row: USNM 386480
4th row: USNM 68620
5th row: USNM MME9342
Value                   Count    Frequency (%)
usnm                    596967   49.8%
wam                     63       < 0.1%
mb                      40       < 0.1%
zin                     21       < 0.1%
lacm                    18       < 0.1%
nsmt                    12       < 0.1%
sama                    6        < 0.1%
zmmu                    5        < 0.1%
rmnh                    4        < 0.1%
ncsm                    4        < 0.1%
Other values (601439)   601471   50.2%

Most occurring characters

ValueCountFrequency (%)
M 627122
9.5%
S 616877
 
9.4%
N 601401
 
9.2%
U 598144
 
9.1%
(space) 597160
 
9.1%
1 405808
 
6.2%
2 403390
 
6.1%
3 394478
 
6.0%
5 393693
 
6.0%
4 379861
 
5.8%
Other values (25) 1550327
23.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3465081
52.8%
Uppercase Letter 2506018
38.2%
Space Separator 597160
 
9.1%
Other Punctuation 2
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 627122
25.0%
S 616877
24.6%
N 601401
24.0%
U 598144
23.9%
R 17298
 
0.7%
T 17251
 
0.7%
E 14721
 
0.6%
A 10176
 
0.4%
C 553
 
< 0.1%
H 550
 
< 0.1%
Other values (13) 1925
 
0.1%
Decimal Number
ValueCountFrequency (%)
1 405808
11.7%
2 403390
11.6%
3 394478
11.4%
5 393693
11.4%
4 379861
11.0%
6 309193
8.9%
7 297996
8.6%
0 295420
8.5%
8 295286
8.5%
9 289956
8.4%
Space Separator
ValueCountFrequency (%)
(space) 597160
100.0%
Other Punctuation
ValueCountFrequency (%)
? 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4062243
61.8%
Latin 2506018
38.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 627122
25.0%
S 616877
24.6%
N 601401
24.0%
U 598144
23.9%
R 17298
 
0.7%
T 17251
 
0.7%
E 14721
 
0.6%
A 10176
 
0.4%
C 553
 
< 0.1%
H 550
 
< 0.1%
Other values (13) 1925
 
0.1%
Common
ValueCountFrequency (%)
(space) 597160
14.7%
1 405808
10.0%
2 403390
9.9%
3 394478
9.7%
5 393693
9.7%
4 379861
9.4%
6 309193
7.6%
7 297996
7.3%
0 295420
7.3%
8 295286
7.3%
Other values (2) 289958
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6568261
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 627122
9.5%
S 616877
 
9.4%
N 601401
 
9.2%
U 598144
 
9.1%
(space) 597160
 
9.1%
1 405808
 
6.2%
2 403390
 
6.1%
3 394478
 
6.0%
5 393693
 
6.0%
4 379861
 
5.8%
Other values (25) 1550327
23.6%

recordNumber
Text

Missing 

Distinct: 172937
Distinct (%): 31.4%
Missing: 50821
Missing (%): 8.4%
Memory size: 4.6 MiB

Length

Max length: 35
Median length: 28
Mean length: 5.176632221
Min length: 1

Characters and Unicode

Total characters: 2850409
Distinct characters: 79
Distinct categories: 11
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 147848
Unique (%): 26.9%

Sample

1st row: FMG 2371
2nd row: 142/19534X
3rd row: 07960
4th row: 6459
5th row: B47586/R50468
Value                   Count    Frequency (%)
no                      47434    6.9%
number                  47222    6.9%
cohjr                   5988     0.9%
nzp                     3372     0.5%
psc                     2713     0.4%
jwk                     2021     0.3%
r                       1947     0.3%
fm                      1793     0.3%
jjg                     1781     0.3%
rem                     1569     0.2%
Other values (105383)   570874   83.1%

Most occurring characters

ValueCountFrequency (%)
1 307242
 
10.8%
2 246234
 
8.6%
3 208467
 
7.3%
4 190900
 
6.7%
0 182605
 
6.4%
5 181877
 
6.4%
6 173588
 
6.1%
7 165796
 
5.8%
8 159989
 
5.6%
9 153227
 
5.4%
Other values (69) 880484
30.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1969925
69.1%
Uppercase Letter 409557
 
14.4%
Lowercase Letter 285569
 
10.0%
Space Separator 136084
 
4.8%
Other Punctuation 26739
 
0.9%
Dash Punctuation 20734
 
0.7%
Close Punctuation 888
 
< 0.1%
Open Punctuation 886
 
< 0.1%
Currency Symbol 13
 
< 0.1%
Math Symbol 10
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 106292
26.0%
R 28947
 
7.1%
M 24702
 
6.0%
J 23837
 
5.8%
C 21743
 
5.3%
H 19696
 
4.8%
X 17857
 
4.4%
B 15635
 
3.8%
P 15412
 
3.8%
E 14048
 
3.4%
Other values (16) 121388
29.6%
Lowercase Letter
ValueCountFrequency (%)
r 47347
16.6%
e 47325
16.6%
o 47216
16.5%
m 47180
16.5%
u 47177
16.5%
b 47174
16.5%
n 1310
 
0.5%
a 152
 
0.1%
p 115
 
< 0.1%
i 108
 
< 0.1%
Other values (13) 465
 
0.2%
Decimal Number
ValueCountFrequency (%)
1 307242
15.6%
2 246234
12.5%
3 208467
10.6%
4 190900
9.7%
0 182605
9.3%
5 181877
9.2%
6 173588
8.8%
7 165796
8.4%
8 159989
8.1%
9 153227
7.8%
Other Punctuation
ValueCountFrequency (%)
/ 23475
87.8%
. 2050
 
7.7%
, 626
 
2.3%
# 248
 
0.9%
? 202
 
0.8%
& 47
 
0.2%
; 44
 
0.2%
: 22
 
0.1%
* 21
 
0.1%
' 4
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 887
99.9%
] 1
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 885
99.9%
[ 1
 
0.1%
Math Symbol
ValueCountFrequency (%)
= 6
60.0%
+ 4
40.0%
Space Separator
ValueCountFrequency (%)
(space) 136084
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 20734
100.0%
Currency Symbol
ValueCountFrequency (%)
¢ 13
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2155283
75.6%
Latin 695126
 
24.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 106292
15.3%
r 47347
 
6.8%
e 47325
 
6.8%
o 47216
 
6.8%
m 47180
 
6.8%
u 47177
 
6.8%
b 47174
 
6.8%
R 28947
 
4.2%
M 24702
 
3.6%
J 23837
 
3.4%
Other values (39) 227929
32.8%
Common
ValueCountFrequency (%)
1 307242
14.3%
2 246234
11.4%
3 208467
9.7%
4 190900
8.9%
0 182605
8.5%
5 181877
8.4%
6 173588
8.1%
7 165796
7.7%
8 159989
7.4%
9 153227
7.1%
Other values (20) 185358
8.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2850396
> 99.9%
None 13
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 307242
 
10.8%
2 246234
 
8.6%
3 208467
 
7.3%
4 190900
 
6.7%
0 182605
 
6.4%
5 181877
 
6.4%
6 173588
 
6.1%
7 165796
 
5.8%
8 159989
 
5.6%
9 153227
 
5.4%
Other values (68) 880471
30.9%
None
ValueCountFrequency (%)
¢ 13
100.0%

recordedBy
Text

Missing 

Distinct: 17644
Distinct (%): 3.2%
Missing: 55563
Missing (%): 9.2%
Memory size: 4.6 MiB

Length

Max length: 124
Median length: 114
Mean length: 11.92282483
Min length: 1

Characters and Unicode

Total characters: 6508527
Distinct characters: 80
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9079
Unique (%): 1.7%

Sample

1st row: F. Greenwell
2nd row: J. Silver
3rd row: Smithsonian Venezuelan Project
4th row: Nelson & E. Goldman
5th row: W. Bowen & V. Thayer
Value                  Count    Frequency (%)
j                      60783    4.7%
e                      54366    4.2%
c                      53496    4.2%
(label missing)        50457    3.9%
r                      49868    3.9%
a                      44074    3.4%
w                      37880    2.9%
h                      30720    2.4%
d                      24753    1.9%
m                      23831    1.9%
Other values (10447)   856734   66.6%

Most occurring characters

ValueCountFrequency (%)
(space) 741074
 
11.4%
e 563544
 
8.7%
. 539103
 
8.3%
n 389678
 
6.0%
a 341353
 
5.2%
o 335107
 
5.1%
r 327053
 
5.0%
l 295446
 
4.5%
i 245022
 
3.8%
s 228632
 
3.5%
Other values (70) 2502515
38.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3897970
59.9%
Uppercase Letter 1254996
 
19.3%
Space Separator 741074
 
11.4%
Other Punctuation 599060
 
9.2%
Close Punctuation 5447
 
0.1%
Open Punctuation 5376
 
0.1%
Dash Punctuation 2452
 
< 0.1%
Decimal Number 2151
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 563544
14.5%
n 389678
10.0%
a 341353
8.8%
o 335107
8.6%
r 327053
 
8.4%
l 295446
 
7.6%
i 245022
 
6.3%
s 228632
 
5.9%
t 223935
 
5.7%
h 116266
 
3.0%
Other values (18) 831934
21.3%
Uppercase Letter
ValueCountFrequency (%)
R 91216
 
7.3%
M 88625
 
7.1%
C 87417
 
7.0%
S 86724
 
6.9%
H 84189
 
6.7%
G 82831
 
6.6%
J 76177
 
6.1%
A 70972
 
5.7%
E 64988
 
5.2%
P 62861
 
5.0%
Other values (16) 458996
36.6%
Other Punctuation
ValueCountFrequency (%)
. 539103
90.0%
& 50656
 
8.5%
, 8029
 
1.3%
' 1002
 
0.2%
/ 114
 
< 0.1%
: 78
 
< 0.1%
? 29
 
< 0.1%
" 26
 
< 0.1%
; 13
 
< 0.1%
# 10
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 1561
72.6%
8 243
 
11.3%
2 219
 
10.2%
4 34
 
1.6%
6 33
 
1.5%
0 31
 
1.4%
9 12
 
0.6%
5 8
 
0.4%
3 7
 
0.3%
7 3
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 5375
> 99.9%
[ 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 741074
100.0%
Close Punctuation
ValueCountFrequency (%)
) 5447
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2452
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5152966
79.2%
Common 1355561
 
20.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 563544
 
10.9%
n 389678
 
7.6%
a 341353
 
6.6%
o 335107
 
6.5%
r 327053
 
6.3%
l 295446
 
5.7%
i 245022
 
4.8%
s 228632
 
4.4%
t 223935
 
4.3%
h 116266
 
2.3%
Other values (44) 2086930
40.5%
Common
ValueCountFrequency (%)
(space) 741074
54.7%
. 539103
39.8%
& 50656
 
3.7%
, 8029
 
0.6%
) 5447
 
0.4%
( 5375
 
0.4%
- 2452
 
0.2%
1 1561
 
0.1%
' 1002
 
0.1%
8 243
 
< 0.1%
Other values (16) 619
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6508521
> 99.9%
None 6
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 741074
 
11.4%
e 563544
 
8.7%
. 539103
 
8.3%
n 389678
 
6.0%
a 341353
 
5.2%
o 335107
 
5.1%
r 327053
 
5.0%
l 295446
 
4.5%
i 245022
 
3.8%
s 228632
 
3.5%
Other values (68) 2502509
38.4%
None
ValueCountFrequency (%)
ç 3
50.0%
ā 3
50.0%
Distinct: 21
Distinct (%): < 0.1%
Missing: 44
Missing (%): < 0.1%
Memory size: 4.6 MiB

Length

Max length: 4
Median length: 1
Mean length: 1.000033255
Min length: 1

Characters and Unicode

Total characters: 601427
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 11
Unique (%): < 0.1%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1
Value               Count    Frequency (%)
1                   601314   > 99.9%
2                   45       < 0.1%
6                   8        < 0.1%
3                   8        < 0.1%
4                   6        < 0.1%
7                   5        < 0.1%
5                   4        < 0.1%
271                 2        < 0.1%
11                  2        < 0.1%
20                  2        < 0.1%
Other values (11)   11       < 0.1%

Most occurring characters

ValueCountFrequency (%)
1 601326
> 99.9%
2 51
 
< 0.1%
6 9
 
< 0.1%
3 9
 
< 0.1%
4 9
 
< 0.1%
7 8
 
< 0.1%
0 7
 
< 0.1%
5 6
 
< 0.1%
9 1
 
< 0.1%
8 1
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 601427
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 601326
> 99.9%
2 51
 
< 0.1%
6 9
 
< 0.1%
3 9
 
< 0.1%
4 9
 
< 0.1%
7 8
 
< 0.1%
0 7
 
< 0.1%
5 6
 
< 0.1%
9 1
 
< 0.1%
8 1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 601427
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 601326
> 99.9%
2 51
 
< 0.1%
6 9
 
< 0.1%
3 9
 
< 0.1%
4 9
 
< 0.1%
7 8
 
< 0.1%
0 7
 
< 0.1%
5 6
 
< 0.1%
9 1
 
< 0.1%
8 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 601427
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 601326
> 99.9%
2 51
 
< 0.1%
6 9
 
< 0.1%
3 9
 
< 0.1%
4 9
 
< 0.1%
7 8
 
< 0.1%
0 7
 
< 0.1%
5 6
 
< 0.1%
9 1
 
< 0.1%
8 1
 
< 0.1%

sex
Text

Missing 

Distinct: 2
Distinct (%): < 0.1%
Missing: 88216
Missing (%): 14.7%
Memory size: 4.6 MiB

Length

Max length: 6
Median length: 4
Mean length: 4.961610179
Min length: 4

Characters and Unicode

Total characters: 2546472
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: MALE
2nd row: MALE
3rd row: MALE
4th row: FEMALE
5th row: FEMALE
Value    Count    Frequency (%)
male     266469   51.9%
female   246766   48.1%

Most occurring characters

ValueCountFrequency (%)
E 760001
29.8%
M 513235
20.2%
A 513235
20.2%
L 513235
20.2%
F 246766
 
9.7%
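
The character counts above follow directly from the value counts for this variable, which a few lines of Python can confirm. This is only a consistency check, assuming every non-missing value is exactly MALE or FEMALE as the sample rows suggest; the variable names are illustrative.

```python
from collections import Counter

# Value counts copied from this variable's frequency table (uppercase as in the samples).
value_counts = {"MALE": 266469, "FEMALE": 246766}

# Each occurrence of a value contributes one count per character it contains.
char_counts = Counter()
for value, n in value_counts.items():
    for ch in value:
        char_counts[ch] += n

total_chars = sum(char_counts.values())
mean_length = total_chars / sum(value_counts.values())

print(char_counts["E"], total_chars, round(mean_length, 6))
```

With these inputs, E appears once per MALE and twice per FEMALE, which reproduces the 760001 reported for E, and the total matches the reported 2546472 characters.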

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2546472
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 760001
29.8%
M 513235
20.2%
A 513235
20.2%
L 513235
20.2%
F 246766
 
9.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 2546472
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 760001
29.8%
M 513235
20.2%
A 513235
20.2%
L 513235
20.2%
F 246766
 
9.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2546472
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 760001
29.8%
M 513235
20.2%
A 513235
20.2%
L 513235
20.2%
F 246766
 
9.7%

lifeStage
Text

Missing 

Distinct: 10
Distinct (%): < 0.1%
Missing: 550088
Missing (%): 91.5%
Memory size: 4.6 MiB

Length

Max length: 8
Median length: 5
Mean length: 6.093024161
Min length: 5

Characters and Unicode

Total characters: 312956
Distinct characters: 27
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Adult
2nd row: Adult
3rd row: Juvenile
4th row: Juvenile
5th row: Adult
Value      Count   Frequency (%)
adult      31097   60.5%
juvenile   11486   22.4%
immature   3896    7.6%
subadult   2153    4.2%
embryo     983     1.9%
fetus      681     1.3%
nestling   499     1.0%
neonate    448     0.9%
mature     80      0.2%
unknown    40      0.1%

Most occurring characters

ValueCountFrequency (%)
u 51546
16.5%
l 45235
14.5%
t 38854
12.4%
d 33250
10.6%
A 31097
9.9%
e 29024
9.3%
n 12553
 
4.0%
i 11985
 
3.8%
J 11486
 
3.7%
v 11486
 
3.7%
Other values (17) 36440
11.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 261593
83.6%
Uppercase Letter 51363
 
16.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 51546
19.7%
l 45235
17.3%
t 38854
14.9%
d 33250
12.7%
e 29024
11.1%
n 12553
 
4.8%
i 11985
 
4.6%
v 11486
 
4.4%
m 8775
 
3.4%
a 6577
 
2.5%
Other values (8) 12308
 
4.7%
Uppercase Letter
ValueCountFrequency (%)
A 31097
60.5%
J 11486
 
22.4%
I 3896
 
7.6%
S 2153
 
4.2%
E 983
 
1.9%
N 947
 
1.8%
F 681
 
1.3%
M 80
 
0.2%
U 40
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 312956
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
u 51546
16.5%
l 45235
14.5%
t 38854
12.4%
d 33250
10.6%
A 31097
9.9%
e 29024
9.3%
n 12553
 
4.0%
i 11985
 
3.8%
J 11486
 
3.7%
v 11486
 
3.7%
Other values (17) 36440
11.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 312956
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
u 51546
16.5%
l 45235
14.5%
t 38854
12.4%
d 33250
10.6%
A 31097
9.9%
e 29024
9.3%
n 12553
 
4.0%
i 11985
 
3.8%
J 11486
 
3.7%
v 11486
 
3.7%
Other values (17) 36440
11.6%

occurrenceStatus
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 4210157
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PRESENT
2nd row: PRESENT
3rd row: PRESENT
4th row: PRESENT
5th row: PRESENT
Value     Count    Frequency (%)
present   601451   100.0%

Most occurring characters

ValueCountFrequency (%)
E 1202902
28.6%
P 601451
14.3%
R 601451
14.3%
S 601451
14.3%
N 601451
14.3%
T 601451
14.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 4210157
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 1202902
28.6%
P 601451
14.3%
R 601451
14.3%
S 601451
14.3%
N 601451
14.3%
T 601451
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 4210157
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 1202902
28.6%
P 601451
14.3%
R 601451
14.3%
S 601451
14.3%
N 601451
14.3%
T 601451
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4210157
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 1202902
28.6%
P 601451
14.3%
R 601451
14.3%
S 601451
14.3%
N 601451
14.3%
T 601451
14.3%

preparations
Text

Missing 

Distinct: 542
Distinct (%): 0.1%
Missing: 26965
Missing (%): 4.5%
Memory size: 4.6 MiB

Length

Max length: 73
Median length: 11
Mean length: 10.02423558
Min length: 4

Characters and Unicode

Total characters: 5758783
Distinct characters: 49
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 248
Unique (%): < 0.1%

Sample

1st row: Skin; Skull
2nd row: Skin; Skull
3rd row: Skin; Skull
4th row: Skin; Skull
5th row: Skin; Skull
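
The sample rows suggest that preparations holds a semicolon-separated list of preparation types. A minimal sketch of splitting such values into individual parts follows; the helper name `split_preparations` is hypothetical, and the assumption that ";" is always the delimiter is inferred from the samples rather than documented in this report.

```python
from collections import Counter


def split_preparations(value):
    """Split a preparations string on ';' and trim whitespace around each part."""
    return [part.strip() for part in value.split(";") if part.strip()]


# The five sample rows shown for this variable.
samples = ["Skin; Skull"] * 5

# Tally how often each preparation type occurs across the samples.
part_counts = Counter(
    part for value in samples for part in split_preparations(value)
)
print(part_counts)
```

Applied to the whole column, a tally like this would separate compound entries such as "Skin; Skull" into per-type counts, which is easier to compare across specimens than the raw strings.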
Value               Count    Frequency (%)
skull               452764   44.7%
skin                367609   36.3%
fluid               101452   10.0%
skeleton            36584    3.6%
partial             10316    1.0%
in                  8642     0.9%
remainder           8641     0.9%
anatomical          5878     0.6%
baculum/baubellum   3372     0.3%
baleen              2349     0.2%
Other values (42)   14726    1.5%

Most occurring characters

ValueCountFrequency (%)
l 1076304
18.7%
k 859539
14.9%
S 856659
14.9%
u 570461
9.9%
i 506031
8.8%
(space) 437847
7.6%
n 435543
7.6%
; 404417
 
7.0%
d 111124
 
1.9%
e 103346
 
1.8%
Other values (39) 397512
 
6.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3909067
67.9%
Uppercase Letter 1004072
 
17.4%
Space Separator 437847
 
7.6%
Other Punctuation 407794
 
7.1%
Decimal Number 2
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 1076304
27.5%
k 859539
22.0%
u 570461
14.6%
i 506031
12.9%
n 435543
11.1%
d 111124
 
2.8%
e 103346
 
2.6%
t 60548
 
1.5%
o 55332
 
1.4%
a 53911
 
1.4%
Other values (15) 76928
 
2.0%
Uppercase Letter
ValueCountFrequency (%)
S 856659
85.3%
F 101451
 
10.1%
P 11688
 
1.2%
B 9093
 
0.9%
R 8650
 
0.9%
A 6797
 
0.7%
T 3295
 
0.3%
H 2684
 
0.3%
O 1310
 
0.1%
M 940
 
0.1%
Other values (6) 1505
 
0.1%
Other Punctuation
ValueCountFrequency (%)
; 404417
99.2%
/ 3372
 
0.8%
, 4
 
< 0.1%
. 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
5 1
50.0%
6 1
50.0%
Space Separator
ValueCountFrequency (%)
(space) 437847
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4913139
85.3%
Common 845644
 
14.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 1076304
21.9%
k 859539
17.5%
S 856659
17.4%
u 570461
11.6%
i 506031
10.3%
n 435543
8.9%
d 111124
 
2.3%
e 103346
 
2.1%
F 101451
 
2.1%
t 60548
 
1.2%
Other values (31) 232133
 
4.7%
Common
ValueCountFrequency (%)
(space) 437847
51.8%
; 404417
47.8%
/ 3372
 
0.4%
, 4
 
< 0.1%
5 1
 
< 0.1%
. 1
 
< 0.1%
6 1
 
< 0.1%
+ 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5758783
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 1076304
18.7%
k 859539
14.9%
S 856659
14.9%
u 570461
9.9%
i 506031
8.8%
(space) 437847
7.6%
n 435543
7.6%
; 404417
 
7.0%
d 111124
 
1.9%
e 103346
 
1.8%
Other values (39) 397512
 
6.9%

associatedSequences
Text

Missing 

Distinct: 1050
Distinct (%): 99.6%
Missing: 600397
Missing (%): 99.8%
Memory size: 4.6 MiB

Length

Max length: 699
Median length: 49
Mean length: 99.59108159
Min length: 47

Characters and Unicode

Total characters: 104969
Distinct characters: 58
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1046
Unique (%): 99.2%

Sample

1st row: https://www.ncbi.nlm.nih.gov/gquery?term=AY922964;https://www.ncbi.nlm.nih.gov/gquery?term=AY922875
2nd row: https://www.ncbi.nlm.nih.gov/gquery?term=KC753815;https://www.ncbi.nlm.nih.gov/gquery?term=KC753933;https://www.ncbi.nlm.nih.gov/gquery?term=KC754042;https://www.ncbi.nlm.nih.gov/gquery?term=KC754162;https://www.ncbi.nlm.nih.gov/gquery?term=KC754280
3rd row: https://www.ncbi.nlm.nih.gov/gquery?term=KC011508;https://www.ncbi.nlm.nih.gov/gquery?term=KC011594;https://www.ncbi.nlm.nih.gov/gquery?term=KC011682
4th row: https://www.ncbi.nlm.nih.gov/gquery?term=MN707485;https://www.ncbi.nlm.nih.gov/gquery?term=MN707432
5th row: https://www.ncbi.nlm.nih.gov/gquery?term=JQ317640;https://www.ncbi.nlm.nih.gov/gquery?term=JQ317668
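
Each sample value is a semicolon-joined list of query URLs whose `term` parameter carries what appears to be a sequence accession. The sketch below pulls those terms out with the standard `urllib.parse` module; the helper name `accession_ids` is hypothetical, and the URL shape is assumed from the sample rows only.

```python
from urllib.parse import parse_qs, urlsplit


def accession_ids(value):
    """Return the 'term' query values from a ';'-separated list of URLs."""
    ids = []
    for url in value.split(";"):
        query = urlsplit(url.strip()).query
        # parse_qs maps each parameter name to a list of its values.
        ids.extend(parse_qs(query).get("term", []))
    return ids


# First sample row of this variable.
first_row = (
    "https://www.ncbi.nlm.nih.gov/gquery?term=AY922964;"
    "https://www.ncbi.nlm.nih.gov/gquery?term=AY922875"
)
print(accession_ids(first_row))
```

Splitting on ";" before parsing keeps each URL intact, so the result does not depend on how the query parser treats semicolons.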
ValueCountFrequency (%)
https://www.ncbi.nlm.nih.gov/gquery?term=eu021073 2
 
0.2%
https://www.ncbi.nlm.nih.gov/gquery?term=fj383131 2
 
0.2%
https://www.ncbi.nlm.nih.gov/gquery?term=kx998919 2
 
0.2%
https://www.ncbi.nlm.nih.gov/gquery?term=eu021074 2
 
0.2%
https://www.ncbi.nlm.nih.gov/gquery?term=dq178333;https://www.ncbi.nlm.nih.gov/gquery?term=dq178344 1
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=ay974630;https://www.ncbi.nlm.nih.gov/gquery?term=ay974676 1
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=kc753815;https://www.ncbi.nlm.nih.gov/gquery?term=kc753933;https://www.ncbi.nlm.nih.gov/gquery?term=kc754042;https://www.ncbi.nlm.nih.gov/gquery?term=kc754162;https://www.ncbi.nlm.nih.gov/gquery?term=kc754280 1
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=kc011508;https://www.ncbi.nlm.nih.gov/gquery?term=kc011594;https://www.ncbi.nlm.nih.gov/gquery?term=kc011682 1
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=mn707485;https://www.ncbi.nlm.nih.gov/gquery?term=mn707432 1
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=jq317640;https://www.ncbi.nlm.nih.gov/gquery?term=jq317668 1
 
0.1%
Other values (1040) 1040
98.7%

Most occurring characters

ValueCountFrequency (%)
. 8515
 
8.1%
/ 6360
 
6.1%
w 6360
 
6.1%
n 6360
 
6.1%
t 6360
 
6.1%
h 4240
 
4.0%
r 4240
 
4.0%
e 4240
 
4.0%
i 4240
 
4.0%
m 4240
 
4.0%
Other values (48) 49814
47.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 65720
62.6%
Other Punctuation 20181
 
19.2%
Decimal Number 12730
 
12.1%
Uppercase Letter 4213
 
4.0%
Math Symbol 2120
 
2.0%
Connector Punctuation 5
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
K 814
19.3%
M 721
17.1%
N 422
10.0%
Y 404
9.6%
A 392
9.3%
T 258
 
6.1%
F 237
 
5.6%
J 212
 
5.0%
C 171
 
4.1%
Q 146
 
3.5%
Other values (12) 436
10.3%
Lowercase Letter
ValueCountFrequency (%)
w 6360
 
9.7%
n 6360
 
9.7%
t 6360
 
9.7%
h 4240
 
6.5%
r 4240
 
6.5%
e 4240
 
6.5%
i 4240
 
6.5%
m 4240
 
6.5%
g 4240
 
6.5%
v 2120
 
3.2%
Other values (9) 19080
29.0%
Decimal Number
ValueCountFrequency (%)
7 1517
11.9%
3 1452
11.4%
6 1407
11.1%
9 1389
10.9%
2 1352
10.6%
4 1216
9.6%
8 1213
9.5%
1 1128
8.9%
5 1094
8.6%
0 962
7.6%
Other Punctuation
ValueCountFrequency (%)
. 8515
42.2%
/ 6360
31.5%
? 2120
 
10.5%
: 2120
 
10.5%
; 1066
 
5.3%
Math Symbol
ValueCountFrequency (%)
= 2120
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 69933
66.6%
Common 35036
33.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
w 6360
 
9.1%
n 6360
 
9.1%
t 6360
 
9.1%
h 4240
 
6.1%
r 4240
 
6.1%
e 4240
 
6.1%
i 4240
 
6.1%
m 4240
 
6.1%
g 4240
 
6.1%
v 2120
 
3.0%
Other values (31) 23293
33.3%
Common
ValueCountFrequency (%)
. 8515
24.3%
/ 6360
18.2%
? 2120
 
6.1%
: 2120
 
6.1%
= 2120
 
6.1%
7 1517
 
4.3%
3 1452
 
4.1%
6 1407
 
4.0%
9 1389
 
4.0%
2 1352
 
3.9%
Other values (7) 6684
19.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 104969
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 8515
 
8.1%
/ 6360
 
6.1%
w 6360
 
6.1%
n 6360
 
6.1%
t 6360
 
6.1%
h 4240
 
4.0%
r 4240
 
4.0%
e 4240
 
4.0%
i 4240
 
4.0%
m 4240
 
4.0%
Other values (48) 49814
47.5%

occurrenceRemarks
Text

Missing 

Distinct: 5322
Distinct (%): 49.3%
Missing: 590662
Missing (%): 98.2%
Memory size: 4.6 MiB

Length

Max length: 44804
Median length: 2082
Mean length: 214.0076003
Min length: 4

Characters and Unicode

Total characters: 2308928
Distinct characters: 158
Distinct categories: 18
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4721
Unique (%): 43.8%

Sample

1st row: From ledger catalogue 577876-577900: "field data recorded from field catalogues"
2nd row: Skin found in rotunda hallway hold-up case, 2017. May need tanning before installation into collection.
3rd row: Lectotype designated by Avila Pires (1968:163).
4th row: Skull removed from alcoholic specimen.
5th row: More than 800 dolphins stranded along a 220 km stretch pof the coast of Peru. See STR18239.; Broccetto, Marilia CNN website 22 IV 2012
Value                  Count    Frequency (%)
the                    13880    3.8%
of                     9359     2.6%
and                    7684     2.1%
in                     7077     1.9%
for                    6435     1.8%
to                     6041     1.6%
(label missing)        4896     1.3%
on                     4761     1.3%
was                    4231     1.2%
from                   3875     1.1%
Other values (19019)   298259   81.4%

Most occurring characters

ValueCountFrequency (%)
(space) 355709
15.4%
e 205843
 
8.9%
a 147185
 
6.4%
t 125245
 
5.4%
o 122482
 
5.3%
n 120296
 
5.2%
i 111994
 
4.9%
s 111800
 
4.8%
r 110930
 
4.8%
l 77896
 
3.4%
Other values (148) 819548
35.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1587531
68.8%
Space Separator 355709
 
15.4%
Uppercase Letter 132353
 
5.7%
Decimal Number 122350
 
5.3%
Other Punctuation 87540
 
3.8%
Dash Punctuation 8132
 
0.4%
Close Punctuation 6920
 
0.3%
Open Punctuation 6894
 
0.3%
Math Symbol 680
 
< 0.1%
Connector Punctuation 461
 
< 0.1%
Other values (8) 358
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 205843
13.0%
a 147185
 
9.3%
t 125245
 
7.9%
o 122482
 
7.7%
n 120296
 
7.6%
i 111994
 
7.1%
s 111800
 
7.0%
r 110930
 
7.0%
l 77896
 
4.9%
d 65194
 
4.1%
Other values (53) 388666
24.5%
Uppercase Letter
ValueCountFrequency (%)
S 13793
 
10.4%
M 11265
 
8.5%
N 10762
 
8.1%
T 10560
 
8.0%
C 8190
 
6.2%
F 7728
 
5.8%
I 7523
 
5.7%
A 7439
 
5.6%
B 6332
 
4.8%
R 5318
 
4.0%
Other values (18) 43443
32.8%
Other Punctuation
ValueCountFrequency (%)
. 36734
42.0%
, 26137
29.9%
: 6493
 
7.4%
" 5631
 
6.4%
; 4846
 
5.5%
/ 3229
 
3.7%
' 1865
 
2.1%
# 977
 
1.1%
& 535
 
0.6%
? 299
 
0.3%
Other values (12) 794
 
0.9%
Decimal Number
ValueCountFrequency (%)
1 20642
16.9%
0 20306
16.6%
2 20036
16.4%
5 10174
8.3%
9 10101
8.3%
7 9447
7.7%
6 8256
 
6.7%
3 8246
 
6.7%
4 7859
 
6.4%
8 7283
 
6.0%
Math Symbol
ValueCountFrequency (%)
= 207
30.4%
+ 203
29.9%
~ 120
17.6%
< 79
 
11.6%
> 62
 
9.1%
| 4
 
0.6%
± 2
 
0.3%
¬ 2
 
0.3%
1
 
0.1%
Other Number
ValueCountFrequency (%)
½ 29
65.9%
¼ 7
 
15.9%
¹ 5
 
11.4%
¾ 3
 
6.8%
Dash Punctuation
ValueCountFrequency (%)
- 7459
91.7%
656
 
8.1%
17
 
0.2%
Close Punctuation
ValueCountFrequency (%)
) 6315
91.3%
] 602
 
8.7%
} 3
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 6291
91.3%
[ 600
 
8.7%
{ 3
 
< 0.1%
Final Punctuation
ValueCountFrequency (%)
90
98.9%
» 1
 
1.1%
Currency Symbol
ValueCountFrequency (%)
$ 48
82.8%
¥ 10
 
17.2%
Format
ValueCountFrequency (%)
3
60.0%
 2
40.0%
Modifier Symbol
ValueCountFrequency (%)
´ 1
50.0%
^ 1
50.0%
Space Separator
ValueCountFrequency (%)
(space) 355709
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 461
100.0%
Initial Punctuation
ValueCountFrequency (%)
83
100.0%
Other Symbol
ValueCountFrequency (%)
° 67
100.0%
Other Letter
ValueCountFrequency (%)
º 8
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1719816
74.5%
Common 589036
 
25.5%
Greek 76
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 205843
 
12.0%
a 147185
 
8.6%
t 125245
 
7.3%
o 122482
 
7.1%
n 120296
 
7.0%
i 111994
 
6.5%
s 111800
 
6.5%
r 110930
 
6.5%
l 77896
 
4.5%
d 65194
 
3.8%
Other values (70) 520951
30.3%
Common
ValueCountFrequency (%)
(space) 355709
60.4%
. 36734
 
6.2%
, 26137
 
4.4%
1 20642
 
3.5%
0 20306
 
3.4%
2 20036
 
3.4%
5 10174
 
1.7%
9 10101
 
1.7%
7 9447
 
1.6%
6 8256
 
1.4%
Other values (56) 71494
 
12.1%
Greek
ValueCountFrequency (%)
μ 64
84.2%
ο 2
 
2.6%
ή 1
 
1.3%
ϊ 1
 
1.3%
ι 1
 
1.3%
ν 1
 
1.3%
ρ 1
 
1.3%
υ 1
 
1.3%
δ 1
 
1.3%
α 1
 
1.3%
Other values (2) 2
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2307432
99.9%
Punctuation 858
 
< 0.1%
None 637
 
< 0.1%
Math Operators 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 355709
15.4%
e 205843
 
8.9%
a 147185
 
6.4%
t 125245
 
5.4%
o 122482
 
5.3%
n 120296
 
5.2%
i 111994
 
4.9%
s 111800
 
4.8%
r 110930
 
4.8%
l 77896
 
3.4%
Other values (84) 818052
35.5%
Punctuation
ValueCountFrequency (%)
656
76.5%
90
 
10.5%
83
 
9.7%
17
 
2.0%
4
 
0.5%
3
 
0.3%
2
 
0.2%
2
 
0.2%
1
 
0.1%
None
ValueCountFrequency (%)
· 170
26.7%
é 78
12.2%
° 67
 
10.5%
μ 64
 
10.0%
ì 58
 
9.1%
½ 29
 
4.6%
è 20
 
3.1%
Ö 12
 
1.9%
ä 10
 
1.6%
ü 10
 
1.6%
Other values (44) 119
18.7%
Math Operators
ValueCountFrequency (%)
1
100.0%

eventDate
Text

Missing 

Distinct: 46549
Distinct (%): 8.1%
Missing: 28480
Missing (%): 4.7%
Memory size: 4.6 MiB

Length

Max length: 21
Median length: 10
Mean length: 9.72325999
Min length: 4

Characters and Unicode

Total characters: 5571146
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7620
Unique (%): 1.3%

Sample

1st row: 1989-02-28
2nd row: 1917-08-08
3rd row: 1966-05
4th row: 1894-07-15
5th row: 1992-11-05
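
The sample rows mix full dates with year-month values, and the minimum length of 4 points to year-only values as well. The sketch below classifies a value by its apparent precision. Treating "/" as an interval separator is an inference from the maximum length of 21 and the "/" characters counted for this variable, not something the report states, and the interval string in the example is made up for illustration.

```python
import re

# Patterns for the three precisions visible in the samples and frequency table.
PATTERNS = [
    ("year", re.compile(r"\d{4}")),
    ("month", re.compile(r"\d{4}-\d{2}")),
    ("day", re.compile(r"\d{4}-\d{2}-\d{2}")),
]


def date_precision(value):
    """Classify an eventDate string as 'year', 'month', 'day', 'range' or 'other'."""
    if "/" in value:
        return "range"  # assumed interval form, e.g. start/end
    for name, pattern in PATTERNS:
        if pattern.fullmatch(value):
            return name
    return "other"


# Three values taken from this variable, plus one hypothetical interval.
samples = ["1989-02-28", "1966-05", "1968", "1965-06-01/1965-06-30"]
precisions = [date_precision(s) for s in samples]
print(list(zip(samples, precisions)))
```

A classification like this is a cheap first step before parsing, since values of different precision cannot all be converted to full dates.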
Value                  Count    Frequency (%)
1968                   1161     0.2%
1959                   829      0.1%
1965-06                704      0.1%
1966-06-02             682      0.1%
1903                   600      0.1%
1905                   591      0.1%
1965                   543      0.1%
1967-08                537      0.1%
1967-05                529      0.1%
1968-09-02             520      0.1%
Other values (46539)   566275   98.8%

Most occurring characters

ValueCountFrequency (%)
- 1091809
19.6%
1 1091166
19.6%
0 832958
15.0%
9 716794
12.9%
2 391838
 
7.0%
6 323354
 
5.8%
8 308610
 
5.5%
7 251407
 
4.5%
3 195450
 
3.5%
5 191688
 
3.4%
Other values (2) 176072
 
3.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4478570
80.4%
Dash Punctuation 1091809
 
19.6%
Other Punctuation 767
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1091166
24.4%
0 832958
18.6%
9 716794
16.0%
2 391838
 
8.7%
6 323354
 
7.2%
8 308610
 
6.9%
7 251407
 
5.6%
3 195450
 
4.4%
5 191688
 
4.3%
4 175305
 
3.9%
Dash Punctuation
ValueCountFrequency (%)
- 1091809
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 767
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5571146
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 1091809
19.6%
1 1091166
19.6%
0 832958
15.0%
9 716794
12.9%
2 391838
 
7.0%
6 323354
 
5.8%
8 308610
 
5.5%
7 251407
 
4.5%
3 195450
 
3.5%
5 191688
 
3.4%
Other values (2) 176072
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5571146
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 1091809
19.6%
1 1091166
19.6%
0 832958
15.0%
9 716794
12.9%
2 391838
 
7.0%
6 323354
 
5.8%
8 308610
 
5.5%
7 251407
 
4.5%
3 195450
 
3.5%
5 191688
 
3.4%
Other values (2) 176072
 
3.2%

startDayOfYear
Text

Missing 

Distinct: 366
Distinct (%): 0.1%
Missing: 67487
Missing (%): 11.2%
Memory size: 4.6 MiB

Length

Max length: 3
Median length: 3
Mean length: 2.721050483
Min length: 1

Characters and Unicode

Total characters: 1452943
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 59
2nd row: 220
3rd row: 196
4th row: 310
5th row: 77
Value                   Count     Frequency (%)
193                     2428      0.5%
222                     2369      0.4%
199                     2342      0.4%
205                     2305      0.4%
207                     2235      0.4%
208                     2179      0.4%
197                     2151      0.4%
202                     2126      0.4%
203                     2117      0.4%
201                     2091      0.4%
Other values (356)      511621    95.8%
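
The sampled values 59, 220, 196 and 310 are consistent with the ordinal day of the year for full dates such as 1989-02-28, 1917-08-08, 1894-07-15 and 1992-11-05. The sketch below shows that calculation with the standard library. That startDayOfYear is derived this way is an assumption, and `day_of_year` is a hypothetical helper name.

```python
from datetime import date


def day_of_year(iso_date):
    """Return the 1-based ordinal day within the year for a YYYY-MM-DD string."""
    return date.fromisoformat(iso_date).timetuple().tm_yday


for d in ["1989-02-28", "1917-08-08", "1894-07-15", "1992-11-05"]:
    print(d, day_of_year(d))
```

The 366 distinct values reported for this variable fit this reading, since 31 December of a leap year is day 366.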

Most occurring characters

ValueCountFrequency (%)
1 275979
19.0%
2 269556
18.6%
3 182055
12.5%
5 109078
 
7.5%
4 108546
 
7.5%
6 105788
 
7.3%
7 102563
 
7.1%
9 100514
 
6.9%
0 99672
 
6.9%
8 99192
 
6.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1452943
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 275979
19.0%
2 269556
18.6%
3 182055
12.5%
5 109078
 
7.5%
4 108546
 
7.5%
6 105788
 
7.3%
7 102563
 
7.1%
9 100514
 
6.9%
0 99672
 
6.9%
8 99192
 
6.8%

Most occurring scripts

ValueCountFrequency (%)
Common 1452943
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 275979
19.0%
2 269556
18.6%
3 182055
12.5%
5 109078
 
7.5%
4 108546
 
7.5%
6 105788
 
7.3%
7 102563
 
7.1%
9 100514
 
6.9%
0 99672
 
6.9%
8 99192
 
6.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1452943
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 275979
19.0%
2 269556
18.6%
3 182055
12.5%
5 109078
 
7.5%
4 108546
 
7.5%
6 105788
 
7.3%
7 102563
 
7.1%
9 100514
 
6.9%
0 99672
 
6.9%
8 99192
 
6.8%

endDayOfYear
Text

Missing 

Distinct: 366
Distinct (%): 0.1%
Missing: 67487
Missing (%): 11.2%
Memory size: 4.6 MiB

Length

Max length: 3
Median length: 3
Mean length: 2.721117903
Min length: 1

Characters and Unicode

Total characters: 1452979
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 59
2nd row: 220
3rd row: 196
4th row: 310
5th row: 77
Value                   Count     Frequency (%)
222                     2369      0.4%
193                     2355      0.4%
199                     2343      0.4%
205                     2304      0.4%
207                     2253      0.4%
208                     2179      0.4%
197                     2150      0.4%
204                     2149      0.4%
202                     2125      0.4%
203                     2117      0.4%
Other values (356)      511620    95.8%

Most occurring characters

ValueCountFrequency (%)
1 275595
19.0%
2 269744
18.6%
3 182061
12.5%
5 109081
 
7.5%
4 108843
 
7.5%
6 105700
 
7.3%
7 102579
 
7.1%
9 100408
 
6.9%
0 99751
 
6.9%
8 99217
 
6.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1452979
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 275595
19.0%
2 269744
18.6%
3 182061
12.5%
5 109081
 
7.5%
4 108843
 
7.5%
6 105700
 
7.3%
7 102579
 
7.1%
9 100408
 
6.9%
0 99751
 
6.9%
8 99217
 
6.8%

Most occurring scripts

ValueCountFrequency (%)
Common 1452979
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 275595
19.0%
2 269744
18.6%
3 182061
12.5%
5 109081
 
7.5%
4 108843
 
7.5%
6 105700
 
7.3%
7 102579
 
7.1%
9 100408
 
6.9%
0 99751
 
6.9%
8 99217
 
6.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1452979
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 275595
19.0%
2 269744
18.6%
3 182061
12.5%
5 109081
 
7.5%
4 108843
 
7.5%
6 105700
 
7.3%
7 102579
 
7.1%
9 100408
 
6.9%
0 99751
 
6.9%
8 99217
 
6.8%

year
Text

Missing 

Distinct: 350
Distinct (%): 0.1%
Missing: 28519
Missing (%): 4.7%
Memory size: 4.6 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 2291728
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 75
Unique (%): < 0.1%

Sample

1st row: 1989
2nd row: 1917
3rd row: 1966
4th row: 1894
5th row: 1992
Value                   Count     Frequency (%)
1967                    30814     5.4%
1968                    27037     4.7%
1966                    22575     3.9%
1969                    15259     2.7%
1965                    12690     2.2%
1964                    12541     2.2%
1962                    11208     2.0%
1970                    10525     1.8%
1916                    9955      1.7%
1963                    9798      1.7%
Other values (340)      410530    71.7%

Most occurring characters

ValueCountFrequency (%)
1 669600
29.2%
9 621357
27.1%
6 214928
 
9.4%
8 199542
 
8.7%
7 134576
 
5.9%
0 132983
 
5.8%
5 87279
 
3.8%
2 86813
 
3.8%
4 76096
 
3.3%
3 68554
 
3.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2291728
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 669600
29.2%
9 621357
27.1%
6 214928
 
9.4%
8 199542
 
8.7%
7 134576
 
5.9%
0 132983
 
5.8%
5 87279
 
3.8%
2 86813
 
3.8%
4 76096
 
3.3%
3 68554
 
3.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2291728
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 669600
29.2%
9 621357
27.1%
6 214928
 
9.4%
8 199542
 
8.7%
7 134576
 
5.9%
0 132983
 
5.8%
5 87279
 
3.8%
2 86813
 
3.8%
4 76096
 
3.3%
3 68554
 
3.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2291728
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 669600
29.2%
9 621357
27.1%
6 214928
 
9.4%
8 199542
 
8.7%
7 134576
 
5.9%
0 132983
 
5.8%
5 87279
 
3.8%
2 86813
 
3.8%
4 76096
 
3.3%
3 68554
 
3.0%

month
Text

Missing 

Distinct: 12
Distinct (%): < 0.1%
Missing: 45368
Missing (%): 7.5%
Memory size: 4.6 MiB

Length

Max length: 2
Median length: 1
Mean length: 1.192809347
Min length: 1

Characters and Unicode

Total characters: 663301
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 8
3rd row: 5
4th row: 7
5th row: 11
Value                   Count     Frequency (%)
7                       63530     11.4%
8                       55595     10.0%
6                       55446     10.0%
3                       50980     9.2%
5                       50113     9.0%
4                       46748     8.4%
9                       43982     7.9%
2                       43057     7.7%
10                      40456     7.3%
1                       39414     7.1%
Other values (2)        66762     12.0%
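
The summary figures for month can be cross-checked against one another. The sketch below recomputes the missing percentage and the mean length from the counts reported for this variable and from the 601451 observations given in the dataset overview. It assumes the mean length is taken over non-missing values only, which the arithmetic supports. The variable names are illustrative.

```python
n_obs = 601451          # number of observations (dataset overview)
missing = 45368         # missing values reported for month
total_chars = 663301    # total characters reported for month

# Counts from the frequency table: the ten listed values plus "Other values (2)".
table_counts = [63530, 55595, 55446, 50980, 50113,
                46748, 43982, 43057, 40456, 39414, 66762]

non_missing = n_obs - missing
missing_pct = 100 * missing / n_obs
mean_length = total_chars / non_missing

print(non_missing, round(missing_pct, 1), round(mean_length, 9))
```

The frequency table also sums to the non-missing count, so its percentages are relative to non-missing records rather than to all observations.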

Most occurring characters

ValueCountFrequency (%)
1 181841
27.4%
2 74610
11.2%
7 63530
 
9.6%
8 55595
 
8.4%
6 55446
 
8.4%
3 50980
 
7.7%
5 50113
 
7.6%
4 46748
 
7.0%
9 43982
 
6.6%
0 40456
 
6.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 663301
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 181841
27.4%
2 74610
11.2%
7 63530
 
9.6%
8 55595
 
8.4%
6 55446
 
8.4%
3 50980
 
7.7%
5 50113
 
7.6%
4 46748
 
7.0%
9 43982
 
6.6%
0 40456
 
6.1%

Most occurring scripts

ValueCountFrequency (%)
Common 663301
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 181841
27.4%
2 74610
11.2%
7 63530
 
9.6%
8 55595
 
8.4%
6 55446
 
8.4%
3 50980
 
7.7%
5 50113
 
7.6%
4 46748
 
7.0%
9 43982
 
6.6%
0 40456
 
6.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 663301
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 181841
27.4%
2 74610
11.2%
7 63530
 
9.6%
8 55595
 
8.4%
6 55446
 
8.4%
3 50980
 
7.7%
5 50113
 
7.6%
4 46748
 
7.0%
9 43982
 
6.6%
0 40456
 
6.1%

day
Text

Missing 

Distinct: 31
Distinct (%): < 0.1%
Missing: 68254
Missing (%): 11.3%
Memory size: 4.6 MiB

Length

Max length: 2
Median length: 2
Mean length: 1.708122889
Min length: 1

Characters and Unicode

Total characters: 910766
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 28
2nd row: 8
3rd row: 15
4th row: 5
5th row: 18
Value                   Count     Frequency (%)
10                      19183     3.6%
20                      18565     3.5%
22                      18462     3.5%
15                      18330     3.4%
18                      18186     3.4%
14                      17946     3.4%
5                       17919     3.4%
16                      17902     3.4%
27                      17827     3.3%
21                      17778     3.3%
Other values (21)       351099    65.8%

Most occurring characters

ValueCountFrequency (%)
1 237918
26.1%
2 229380
25.2%
3 75570
 
8.3%
5 53783
 
5.9%
0 53189
 
5.8%
8 53046
 
5.8%
7 52750
 
5.8%
6 52475
 
5.8%
4 52022
 
5.7%
9 50633
 
5.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 910766
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 237918
26.1%
2 229380
25.2%
3 75570
 
8.3%
5 53783
 
5.9%
0 53189
 
5.8%
8 53046
 
5.8%
7 52750
 
5.8%
6 52475
 
5.8%
4 52022
 
5.7%
9 50633
 
5.6%

Most occurring scripts

ValueCountFrequency (%)
Common 910766
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 237918
26.1%
2 229380
25.2%
3 75570
 
8.3%
5 53783
 
5.9%
0 53189
 
5.8%
8 53046
 
5.8%
7 52750
 
5.8%
6 52475
 
5.8%
4 52022
 
5.7%
9 50633
 
5.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 910766
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 237918
26.1%
2 229380
25.2%
3 75570
 
8.3%
5 53783
 
5.9%
0 53189
 
5.8%
8 53046
 
5.8%
7 52750
 
5.8%
6 52475
 
5.8%
4 52022
 
5.7%
9 50633
 
5.6%

verbatimEventDate
Text

Missing 

Distinct: 45124
Distinct (%): 8.0%
Missing: 36490
Missing (%): 6.1%
Memory size: 4.6 MiB

Length

Max length: 82
Median length: 11
Mean length: 10.73425953
Min length: 3

Characters and Unicode

Total characters: 6064438
Distinct characters: 75
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7925
Unique (%): 1.4%

Sample

1st row: 28 Feb 1989
2nd row: 8 Aug 1917
3rd row: -- May 1966
4th row: 15 Jul 1894
5th row: 5 Nov 1992
ValueCountFrequency (%)
119289
 
7.0%
jul 59029
 
3.5%
aug 52663
 
3.1%
jun 52253
 
3.1%
mar 49098
 
2.9%
may 47959
 
2.8%
apr 45015
 
2.6%
sep 41961
 
2.5%
feb 40432
 
2.4%
oct 39123
 
2.3%
Other values (873) 1153619
67.8%
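
The sampled verbatim dates follow a "day month-abbreviation year" shape, with `--` standing in for an unknown day. The sketch below converts that one shape into an ISO-style string and returns `None` for anything else. The helper name `verbatim_to_iso` is hypothetical, and the column's longer free-text values (up to 82 characters) are deliberately out of scope.

```python
import re

_MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}

# Day (or "--" for unknown), three-letter month, four-digit year.
_PATTERN = re.compile(r"(\d{1,2}|--) ([A-Za-z]{3}) (\d{4})")


def verbatim_to_iso(text):
    """Convert e.g. '28 Feb 1989' to '1989-02-28'; return None if the shape differs."""
    match = _PATTERN.fullmatch(text.strip())
    if match is None:
        return None
    day, month_name, year = match.groups()
    month = _MONTHS.get(month_name.title())
    if month is None:
        return None
    if day == "--":
        # Unknown day: keep year-month precision only.
        return f"{year}-{month:02d}"
    return f"{year}-{month:02d}-{int(day):02d}"


for v in ["28 Feb 1989", "-- May 1966", "5 Nov 1992"]:
    print(v, "->", verbatim_to_iso(v))
```

The example does not check that the day is valid for the month, so a production version would need that check and a fallback for values that do not match.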

Most occurring characters

ValueCountFrequency (%)
1135480
18.7%
1 869039
14.3%
9 644744
 
10.6%
2 290400
 
4.8%
- 284559
 
4.7%
6 256804
 
4.2%
8 242113
 
4.0%
7 176263
 
2.9%
u 165038
 
2.7%
0 163304
 
2.7%
Other values (65) 1836694
30.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3034227
50.0%
Space Separator 1135480
 
18.7%
Lowercase Letter 1072102
 
17.7%
Uppercase Letter 534667
 
8.8%
Dash Punctuation 284559
 
4.7%
Other Punctuation 3387
 
0.1%
Close Punctuation 7
 
< 0.1%
Open Punctuation 6
 
< 0.1%
Math Symbol 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 165038
15.4%
a 133875
12.5%
e 114602
10.7%
r 97161
9.1%
n 90730
8.5%
p 87414
8.2%
c 68763
6.4%
l 60684
 
5.7%
g 53357
 
5.0%
y 47929
 
4.5%
Other values (14) 152549
14.2%
Uppercase Letter
ValueCountFrequency (%)
J 147559
27.6%
A 97950
18.3%
M 97151
18.2%
S 43634
 
8.2%
F 41188
 
7.7%
O 39198
 
7.3%
N 33829
 
6.3%
D 30011
 
5.6%
W 1456
 
0.3%
E 615
 
0.1%
Other values (13) 2076
 
0.4%
Decimal Number
ValueCountFrequency (%)
1 869039
28.6%
9 644744
21.2%
2 290400
 
9.6%
6 256804
 
8.5%
8 242113
 
8.0%
7 176263
 
5.8%
0 163304
 
5.4%
3 136893
 
4.5%
5 134478
 
4.4%
4 120189
 
4.0%
Other Punctuation
ValueCountFrequency (%)
* 2267
66.9%
, 926
27.3%
? 105
 
3.1%
: 53
 
1.6%
/ 21
 
0.6%
. 6
 
0.2%
' 5
 
0.1%
& 2
 
0.1%
; 2
 
0.1%
Math Symbol
ValueCountFrequency (%)
= 1
33.3%
< 1
33.3%
~ 1
33.3%
Close Punctuation
ValueCountFrequency (%)
) 6
85.7%
] 1
 
14.3%
Open Punctuation
ValueCountFrequency (%)
( 5
83.3%
[ 1
 
16.7%
Space Separator
ValueCountFrequency (%)
1135480
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 284559
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4457669
73.5%
Latin 1606769
 
26.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
u 165038
 
10.3%
J 147559
 
9.2%
a 133875
 
8.3%
e 114602
 
7.1%
A 97950
 
6.1%
r 97161
 
6.0%
M 97151
 
6.0%
n 90730
 
5.6%
p 87414
 
5.4%
c 68763
 
4.3%
Other values (37) 506526
31.5%
Common
ValueCountFrequency (%)
1135480
25.5%
1 869039
19.5%
9 644744
14.5%
2 290400
 
6.5%
- 284559
 
6.4%
6 256804
 
5.8%
8 242113
 
5.4%
7 176263
 
4.0%
0 163304
 
3.7%
3 136893
 
3.1%
Other values (18) 258070
 
5.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6064438
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1135480
18.7%
1 869039
14.3%
9 644744
 
10.6%
2 290400
 
4.8%
- 284559
 
4.7%
6 256804
 
4.2%
8 242113
 
4.0%
7 176263
 
2.9%
u 165038
 
2.7%
0 163304
 
2.7%
Other values (65) 1836694
30.3%

habitat
Text

Missing 

Distinct: 7512
Distinct (%): 5.7%
Missing: 468915
Missing (%): 78.0%
Memory size: 4.6 MiB

Length

Max length: 1014
Median length: 694
Mean length: 27.3692808
Min length: 1

Characters and Unicode

Total characters: 3627415
Distinct characters: 86
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4415
Unique (%): 3.3%

Sample

1st row: Ecological remarks by collector(s): yes
2nd row: Premontane very humid forest
3rd row: Ecological remarks by collector(s): no
4th row: Ecological remarks by collector(s): yes
5th row: Culvert
Value                   Count     Frequency (%)
by                      49297     9.4%
ecological              48727     9.3%
remarks                 48718     9.3%
collector(s             48716     9.3%
yes                     41564     8.0%
forest                  32139     6.2%
tropical                15058     2.9%
humid                   14768     2.8%
no                      7275      1.4%
in                      6943      1.3%
Other values (3497)     208498    40.0%

Most occurring characters

ValueCountFrequency (%)
389167
 
10.7%
o 316538
 
8.7%
e 293307
 
8.1%
r 281112
 
7.7%
l 253946
 
7.0%
s 244547
 
6.7%
c 240040
 
6.6%
a 233816
 
6.4%
i 137021
 
3.8%
t 136017
 
3.7%
Other values (76) 1101904
30.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2931962
80.8%
Space Separator 389167
 
10.7%
Uppercase Letter 134371
 
3.7%
Other Punctuation 62424
 
1.7%
Open Punctuation 49723
 
1.4%
Close Punctuation 49712
 
1.4%
Decimal Number 6872
 
0.2%
Dash Punctuation 3142
 
0.1%
Math Symbol 40
 
< 0.1%
Final Punctuation 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 316538
10.8%
e 293307
10.0%
r 281112
9.6%
l 253946
 
8.7%
s 244547
 
8.3%
c 240040
 
8.2%
a 233816
 
8.0%
i 137021
 
4.7%
t 136017
 
4.6%
y 117063
 
4.0%
Other values (16) 678555
23.1%
Uppercase Letter
ValueCountFrequency (%)
E 49837
37.1%
T 18330
 
13.6%
S 10045
 
7.5%
R 7675
 
5.7%
P 6589
 
4.9%
G 6219
 
4.6%
C 4362
 
3.2%
M 4095
 
3.0%
A 3747
 
2.8%
B 3506
 
2.6%
Other values (16) 19966
14.9%
Other Punctuation
ValueCountFrequency (%)
: 48943
78.4%
, 7291
 
11.7%
. 4022
 
6.4%
; 832
 
1.3%
" 403
 
0.6%
& 381
 
0.6%
/ 229
 
0.4%
? 145
 
0.2%
' 102
 
0.2%
# 62
 
0.1%
Other values (3) 14
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
0 2599
37.8%
1 1142
16.6%
2 872
 
12.7%
3 636
 
9.3%
5 469
 
6.8%
4 334
 
4.9%
8 251
 
3.7%
6 220
 
3.2%
7 185
 
2.7%
9 164
 
2.4%
Close Punctuation
ValueCountFrequency (%)
) 49366
99.3%
] 345
 
0.7%
} 1
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
= 33
82.5%
+ 5
 
12.5%
~ 2
 
5.0%
Open Punctuation
ValueCountFrequency (%)
( 49378
99.3%
[ 345
 
0.7%
Space Separator
ValueCountFrequency (%)
389167
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3142
100.0%
Final Punctuation
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3066333
84.5%
Common 561082
 
15.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 316538
 
10.3%
e 293307
 
9.6%
r 281112
 
9.2%
l 253946
 
8.3%
s 244547
 
8.0%
c 240040
 
7.8%
a 233816
 
7.6%
i 137021
 
4.5%
t 136017
 
4.4%
y 117063
 
3.8%
Other values (42) 812926
26.5%
Common
ValueCountFrequency (%)
389167
69.4%
( 49378
 
8.8%
) 49366
 
8.8%
: 48943
 
8.7%
, 7291
 
1.3%
. 4022
 
0.7%
- 3142
 
0.6%
0 2599
 
0.5%
1 1142
 
0.2%
2 872
 
0.2%
Other values (24) 5160
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3627413
> 99.9%
Punctuation 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
389167
 
10.7%
o 316538
 
8.7%
e 293307
 
8.1%
r 281112
 
7.7%
l 253946
 
7.0%
s 244547
 
6.7%
c 240040
 
6.6%
a 233816
 
6.4%
i 137021
 
3.8%
t 136017
 
3.7%
Other values (75) 1101902
30.4%
Punctuation
ValueCountFrequency (%)
2
100.0%
Distinct: 8925
Distinct (%): 1.5%
Missing: 440
Missing (%): 0.1%
Memory size: 4.6 MiB

Length

Max length: 146
Median length: 124
Mean length: 39.09340095
Min length: 4

Characters and Unicode

Total characters: 23495564
Distinct characters: 91
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3023
Unique (%): 0.5%

Sample

1st row: North America, Panama, Bocas Del Toro
2nd row: North America, United States, Utah
3rd row: South America, Venezuela, Bolivar
4th row: North America, Mexico, Oaxaca
5th row: North America, North Atlantic Ocean, United States, North Carolina, Carteret
Value                   Count     Frequency (%)
america                 390243    12.4%
north                   378352    12.1%
united                  229925    7.3%
states                  225212    7.2%
africa                  111667    3.6%
south                   90792     2.9%
county                  80759     2.6%
asia                    66157     2.1%
ocean                   58408     1.9%
mexico                  50692     1.6%
Other values (5566)     1452640   46.3%
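
The sampled values for this variable look like comma-separated place hierarchies listed from broadest to most specific, though the number of levels varies between records. The sketch below splits such a string into its parts. The helper name `split_geography` is hypothetical, and no meaning is assigned to individual positions, because the sample shows that a water body can appear between the continent and the country.

```python
def split_geography(value):
    """Split a comma-separated place string into its parts, in the order given."""
    return [part.strip() for part in value.split(",") if part.strip()]


print(split_geography("North America, Panama, Bocas Del Toro"))
print(split_geography(
    "North America, North Atlantic Ocean, United States, North Carolina, Carteret"))
```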

Most occurring characters

ValueCountFrequency (%)
2533836
 
10.8%
a 2342309
 
10.0%
i 1683292
 
7.2%
t 1628350
 
6.9%
e 1586909
 
6.8%
r 1444280
 
6.1%
, 1372561
 
5.8%
o 1263879
 
5.4%
n 1236327
 
5.3%
c 879180
 
3.7%
Other values (81) 7524641
32.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 16409922
69.8%
Uppercase Letter 3147373
 
13.4%
Space Separator 2533836
 
10.8%
Other Punctuation 1384733
 
5.9%
Dash Punctuation 19470
 
0.1%
Open Punctuation 106
 
< 0.1%
Close Punctuation 106
 
< 0.1%
Decimal Number 12
 
< 0.1%
Modifier Letter 5
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2342309
14.3%
i 1683292
10.3%
t 1628350
9.9%
e 1586909
9.7%
r 1444280
8.8%
o 1263879
7.7%
n 1236327
7.5%
c 879180
 
5.4%
s 644321
 
3.9%
h 637727
 
3.9%
Other values (35) 3063348
18.7%
Uppercase Letter
ValueCountFrequency (%)
A 694612
22.1%
N 456008
14.5%
S 407690
13.0%
U 266626
 
8.5%
C 259605
 
8.2%
M 141875
 
4.5%
P 124642
 
4.0%
O 99864
 
3.2%
B 97350
 
3.1%
T 70558
 
2.2%
Other values (17) 528543
16.8%
Other Punctuation
ValueCountFrequency (%)
, 1372561
99.1%
' 7365
 
0.5%
. 3951
 
0.3%
? 630
 
< 0.1%
* 122
 
< 0.1%
/ 103
 
< 0.1%
: 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
4 4
33.3%
2 4
33.3%
1 2
16.7%
0 1
 
8.3%
8 1
 
8.3%
Dash Punctuation
ValueCountFrequency (%)
- 19466
> 99.9%
4
 
< 0.1%
Space Separator
ValueCountFrequency (%)
2533836
100.0%
Open Punctuation
ValueCountFrequency (%)
( 106
100.0%
Close Punctuation
ValueCountFrequency (%)
) 106
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 5
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 19557295
83.2%
Common 3938269
 
16.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2342309
 
12.0%
i 1683292
 
8.6%
t 1628350
 
8.3%
e 1586909
 
8.1%
r 1444280
 
7.4%
o 1263879
 
6.5%
n 1236327
 
6.3%
c 879180
 
4.5%
A 694612
 
3.6%
s 644321
 
3.3%
Other values (62) 6153836
31.5%
Common
ValueCountFrequency (%)
2533836
64.3%
, 1372561
34.9%
- 19466
 
0.5%
' 7365
 
0.2%
. 3951
 
0.1%
? 630
 
< 0.1%
* 122
 
< 0.1%
( 106
 
< 0.1%
) 106
 
< 0.1%
/ 103
 
< 0.1%
Other values (9) 23
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23494048
> 99.9%
None 1504
 
< 0.1%
Modifier Letters 5
 
< 0.1%
Punctuation 4
 
< 0.1%
Latin Ext Additional 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2533836
 
10.8%
a 2342309
 
10.0%
i 1683292
 
7.2%
t 1628350
 
6.9%
e 1586909
 
6.8%
r 1444280
 
6.1%
, 1372561
 
5.8%
o 1263879
 
5.4%
n 1236327
 
5.3%
c 879180
 
3.7%
Other values (59) 7523125
32.0%
None
ValueCountFrequency (%)
é 564
37.5%
ó 346
23.0%
ä 178
 
11.8%
í 176
 
11.7%
ê 104
 
6.9%
è 57
 
3.8%
ô 53
 
3.5%
ū 5
 
0.3%
ā 4
 
0.3%
Đ 3
 
0.2%
Other values (9) 14
 
0.9%
Modifier Letters
ValueCountFrequency (%)
ʻ 5
100.0%
Punctuation
ValueCountFrequency (%)
4
100.0%
Latin Ext Additional
ValueCountFrequency (%)
3
100.0%

continent
Text

Missing 

Distinct: 7
Distinct (%): < 0.1%
Missing: 39181
Missing (%): 6.5%
Memory size: 4.6 MiB

Length

Max length: 13
Median length: 13
Mean length: 10.4674249
Min length: 4

Characters and Unicode

Total characters: 5885519
Distinct characters: 15
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NORTH_AMERICA
2nd row: NORTH_AMERICA
3rd row: SOUTH_AMERICA
4th row: NORTH_AMERICA
5th row: NORTH_AMERICA
Value                   Count     Frequency (%)
north_america           305548    54.3%
africa                  100847    17.9%
south_america           70554     12.5%
asia                    64472     11.5%
europe                  13203     2.3%
oceania                 7485      1.3%
antarctica              161       < 0.1%

Most occurring characters

ValueCountFrequency (%)
A 1098295
18.7%
R 795861
13.5%
I 549067
9.3%
C 484756
8.2%
E 409993
 
7.0%
O 396790
 
6.7%
T 376424
 
6.4%
H 376102
 
6.4%
_ 376102
 
6.4%
M 376102
 
6.4%
Other values (5) 646027
11.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 5509417
93.6%
Connector Punctuation 376102
 
6.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 1098295
19.9%
R 795861
14.4%
I 549067
10.0%
C 484756
8.8%
E 409993
 
7.4%
O 396790
 
7.2%
T 376424
 
6.8%
H 376102
 
6.8%
M 376102
 
6.8%
N 313194
 
5.7%
Other values (4) 332833
 
6.0%
Connector Punctuation
ValueCountFrequency (%)
_ 376102
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5509417
93.6%
Common 376102
 
6.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 1098295
19.9%
R 795861
14.4%
I 549067
10.0%
C 484756
8.8%
E 409993
 
7.4%
O 396790
 
7.2%
T 376424
 
6.8%
H 376102
 
6.8%
M 376102
 
6.8%
N 313194
 
5.7%
Other values (4) 332833
 
6.0%
Common
ValueCountFrequency (%)
_ 376102
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5885519
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 1098295
18.7%
R 795861
13.5%
I 549067
9.3%
C 484756
8.2%
E 409993
 
7.0%
O 396790
 
6.7%
T 376424
 
6.4%
H 376102
 
6.4%
_ 376102
 
6.4%
M 376102
 
6.4%
Other values (5) 646027
11.0%

waterBody
Text

Missing 

Distinct: 1298
Distinct (%): 2.1%
Missing: 539858
Missing (%): 89.8%
Memory size: 4.6 MiB

Length

Max length: 79
Median length: 75
Mean length: 24.02534379
Min length: 6

Characters and Unicode

Total characters: 1479793
Distinct characters: 61
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 776
Unique (%): 1.3%

Sample

1st row: North Atlantic Ocean
2nd row: North Pacific Ocean, Bering Sea
3rd row: North Pacific Ocean
4th row: North Atlantic Ocean, Gulf Of Mexico
5th row: North Pacific Ocean
Value                   Count     Frequency (%)
ocean                   58130     25.3%
north                   49957     21.8%
atlantic                30063     13.1%
pacific                 21536     9.4%
sea                     8710      3.8%
of                      8285      3.6%
gulf                    7277      3.2%
mexico                  6087      2.7%
south                   3736      1.6%
indian                  3443      1.5%
Other values (1047)     32100     14.0%

Most occurring characters

ValueCountFrequency (%)
167731
11.3%
a 149650
 
10.1%
c 142458
 
9.6%
t 125319
 
8.5%
n 116971
 
7.9%
i 97425
 
6.6%
e 90274
 
6.1%
o 70318
 
4.8%
O 66128
 
4.5%
r 64946
 
4.4%
Other values (51) 388573
26.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1060630
71.7%
Uppercase Letter 228943
 
15.5%
Space Separator 167731
 
11.3%
Other Punctuation 22340
 
1.5%
Dash Punctuation 147
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 149650
14.1%
c 142458
13.4%
t 125319
11.8%
n 116971
11.0%
i 97425
9.2%
e 90274
8.5%
o 70318
6.6%
r 64946
6.1%
h 61407
5.8%
l 46029
 
4.3%
Other values (17) 95833
9.0%
Uppercase Letter
ValueCountFrequency (%)
O 66128
28.9%
N 50247
21.9%
A 32498
14.2%
P 22062
 
9.6%
S 16927
 
7.4%
G 7662
 
3.3%
C 7479
 
3.3%
M 7332
 
3.2%
B 7248
 
3.2%
I 3893
 
1.7%
Other values (15) 7467
 
3.3%
Other Punctuation
ValueCountFrequency (%)
, 22196
99.4%
? 67
 
0.3%
. 43
 
0.2%
' 33
 
0.1%
* 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
167731
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 147
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1289573
87.1%
Common 190220
 
12.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 149650
11.6%
c 142458
11.0%
t 125319
9.7%
n 116971
 
9.1%
i 97425
 
7.6%
e 90274
 
7.0%
o 70318
 
5.5%
O 66128
 
5.1%
r 64946
 
5.0%
h 61407
 
4.8%
Other values (42) 304677
23.6%
Common
ValueCountFrequency (%)
167731
88.2%
, 22196
 
11.7%
- 147
 
0.1%
? 67
 
< 0.1%
. 43
 
< 0.1%
' 33
 
< 0.1%
* 1
 
< 0.1%
( 1
 
< 0.1%
) 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1479792
> 99.9%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
167731
11.3%
a 149650
 
10.1%
c 142458
 
9.6%
t 125319
 
8.5%
n 116971
 
7.9%
i 97425
 
6.6%
e 90274
 
6.1%
o 70318
 
4.8%
O 66128
 
4.5%
r 64946
 
4.4%
Other values (50) 388572
26.3%
None
ValueCountFrequency (%)
ö 1
100.0%

islandGroup
Text

Missing 

Distinct: 68
Distinct (%): 1.4%
Missing: 596682
Missing (%): 99.2%
Memory size: 4.6 MiB

Length

Max length: 29
Median length: 24
Mean length: 13.28538478
Min length: 8

Characters and Unicode

Total characters: 63358
Distinct characters: 46
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 18
Unique (%): 0.4%

Sample

1st row: Pribilof Islands
2nd row: Pribilof Islands
3rd row: Ryukyu Islands
4th row: Pribilof Islands
5th row: Batan Islands
Value                   Count     Frequency (%)
islands                 3374      40.8%
pribilof                1808      21.9%
moluccas                1194      14.4%
ryukyu                  497       6.0%
babuyan                 176       2.1%
channel                 159       1.9%
batan                   120       1.5%
nicobar                 108       1.3%
bismarck                94        1.1%
yap                     83        1.0%
Other values (66)       653       7.9%

Most occurring characters

ValueCountFrequency (%)
s 8103
12.8%
l 6718
 
10.6%
a 6381
 
10.1%
n 4444
 
7.0%
i 4222
 
6.7%
d 3521
 
5.6%
3497
 
5.5%
I 3376
 
5.3%
o 3353
 
5.3%
c 2688
 
4.2%
Other values (36) 17055
26.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 51599
81.4%
Uppercase Letter 8262
 
13.0%
Space Separator 3497
 
5.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 8103
15.7%
l 6718
13.0%
a 6381
12.4%
n 4444
8.6%
i 4222
8.2%
d 3521
6.8%
o 3353
6.5%
c 2688
 
5.2%
u 2566
 
5.0%
r 2242
 
4.3%
Other values (14) 7361
14.3%
Uppercase Letter
ValueCountFrequency (%)
I 3376
40.9%
P 1814
22.0%
M 1235
 
14.9%
R 497
 
6.0%
B 412
 
5.0%
C 183
 
2.2%
S 153
 
1.9%
A 151
 
1.8%
N 122
 
1.5%
Y 83
 
1.0%
Other values (11) 236
 
2.9%
Space Separator
ValueCountFrequency (%)
3497
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 59861
94.5%
Common 3497
 
5.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 8103
13.5%
l 6718
11.2%
a 6381
10.7%
n 4444
 
7.4%
i 4222
 
7.1%
d 3521
 
5.9%
I 3376
 
5.6%
o 3353
 
5.6%
c 2688
 
4.5%
u 2566
 
4.3%
Other values (35) 14489
24.2%
Common
ValueCountFrequency (%)
3497
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 63358
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 8103
12.8%
l 6718
 
10.6%
a 6381
 
10.1%
n 4444
 
7.0%
i 4222
 
6.7%
d 3521
 
5.6%
3497
 
5.5%
I 3376
 
5.3%
o 3353
 
5.3%
c 2688
 
4.2%
Other values (36) 17055
26.9%

island
Text

Missing 

Distinct: 345
Distinct (%): 0.9%
Missing: 564842
Missing (%): 93.9%
Memory size: 4.6 MiB

Length

Max length: 23
Median length: 21
Mean length: 8.146903767
Min length: 1

Characters and Unicode

Total characters: 298250
Distinct characters: 57
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 103
Unique (%): 0.3%

Sample

1st row: St. Paul Island
2nd row: St. Paul Island
3rd row: Trinidad
4th row: Borneo
5th row: Culion Island
Value                   Count     Frequency (%)
island                  7184      14.8%
borneo                  5932      12.2%
sumatra                 3675      7.5%
luzon                   3124      6.4%
java                    3005      6.2%
celebes                 2678      5.5%
trinidad                2605      5.4%
st                      1818      3.7%
paul                    1799      3.7%
honshu                  1290      2.6%
Other values (366)      15576     32.0%

Most occurring characters

ValueCountFrequency (%)
a 39564
13.3%
n 28846
 
9.7%
o 23778
 
8.0%
e 21049
 
7.1%
r 16512
 
5.5%
d 15796
 
5.3%
l 15656
 
5.2%
s 14538
 
4.9%
u 14063
 
4.7%
12077
 
4.0%
Other values (47) 96371
32.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 235808
79.1%
Uppercase Letter 48529
 
16.3%
Space Separator 12077
 
4.0%
Other Punctuation 1830
 
0.6%
Dash Punctuation 6
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 39564
16.8%
n 28846
12.2%
o 23778
10.1%
e 21049
8.9%
r 16512
7.0%
d 15796
 
6.7%
l 15656
 
6.6%
s 14538
 
6.2%
u 14063
 
6.0%
i 11254
 
4.8%
Other values (16) 34752
14.7%
Uppercase Letter
ValueCountFrequency (%)
I 7839
16.2%
B 7137
14.7%
S 7123
14.7%
L 4203
8.7%
C 3825
7.9%
P 3689
7.6%
T 3258
6.7%
J 3022
 
6.2%
N 2160
 
4.5%
H 1664
 
3.4%
Other values (14) 4609
9.5%
Other Punctuation
ValueCountFrequency (%)
. 1817
99.3%
' 9
 
0.5%
? 2
 
0.1%
* 1
 
0.1%
, 1
 
0.1%
Space Separator
ValueCountFrequency (%)
12077
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 284337
95.3%
Common 13913
 
4.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 39564
13.9%
n 28846
 
10.1%
o 23778
 
8.4%
e 21049
 
7.4%
r 16512
 
5.8%
d 15796
 
5.6%
l 15656
 
5.5%
s 14538
 
5.1%
u 14063
 
4.9%
i 11254
 
4.0%
Other values (40) 83281
29.3%
Common
ValueCountFrequency (%)
12077
86.8%
. 1817
 
13.1%
' 9
 
0.1%
- 6
 
< 0.1%
? 2
 
< 0.1%
* 1
 
< 0.1%
, 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 298250
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 39564
13.3%
n 28846
 
9.7%
o 23778
 
8.0%
e 21049
 
7.1%
r 16512
 
5.5%
d 15796
 
5.3%
l 15656
 
5.2%
s 14538
 
4.9%
u 14063
 
4.7%
12077
 
4.0%
Other values (47) 96371
32.3%
Distinct 221
Distinct (%) < 0.1%
Missing 4662
Missing (%) 0.8%
Memory size 4.6 MiB

Length

Max length 2
Median length 2
Mean length 2
Min length 2

Characters and Unicode

Total characters 1193578
Distinct characters 26
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 10
Unique (%) < 0.1%

Sample

1st row  PA
2nd row  US
3rd row  VE
4th row  MX
5th row  US

Value                 Count     Frequency (%)
us                    226290    37.9%
mx                    35569     6.0%
pa                    25486     4.3%
ve                    24981     4.2%
ca                    19304     3.2%
co                    16625     2.8%
id                    14924     2.5%
zz                    13450     2.3%
br                    12246     2.1%
za                    11853     2.0%
Other values (211)    196061    32.9%

Most occurring characters

ValueCountFrequency (%)
S 237899
19.9%
U 236282
19.8%
A 75072
 
6.3%
M 63999
 
5.4%
C 56100
 
4.7%
E 50124
 
4.2%
P 48406
 
4.1%
Z 47388
 
4.0%
G 36679
 
3.1%
X 35577
 
3.0%
Other values (16) 306052
25.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1193578
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 237899
19.9%
U 236282
19.8%
A 75072
 
6.3%
M 63999
 
5.4%
C 56100
 
4.7%
E 50124
 
4.2%
P 48406
 
4.1%
Z 47388
 
4.0%
G 36679
 
3.1%
X 35577
 
3.0%
Other values (16) 306052
25.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 1193578
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 237899
19.9%
U 236282
19.8%
A 75072
 
6.3%
M 63999
 
5.4%
C 56100
 
4.7%
E 50124
 
4.2%
P 48406
 
4.1%
Z 47388
 
4.0%
G 36679
 
3.1%
X 35577
 
3.0%
Other values (16) 306052
25.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1193578
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 237899
19.9%
U 236282
19.8%
A 75072
 
6.3%
M 63999
 
5.4%
C 56100
 
4.7%
E 50124
 
4.2%
P 48406
 
4.1%
Z 47388
 
4.0%
G 36679
 
3.1%
X 35577
 
3.0%
Other values (16) 306052
25.6%

stateProvince
Text

Missing 

Distinct 1750
Distinct (%) 0.3%
Missing 93954
Missing (%) 15.6%
Memory size 4.6 MiB

Length

Max length 31
Median length 27
Mean length 9.156487625
Min length 1

Characters and Unicode

Total characters 4646890
Distinct characters 75
Distinct categories 9
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
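
As a rough illustration of the character properties mentioned above, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. It is not the report's own implementation: the helper name `category_counts`, the `CATEGORY_NAMES` mapping, and the five-row sample are assumptions made for this example. The standard library exposes the general category but not the script or block of a character, so those two breakdowns would need another source.

```python
import unicodedata
from collections import Counter

# Long names for the two-letter general-category codes that appear in
# this section of the report. Codes not listed here are reported as-is.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
    "Nd": "Decimal Number",
    "Sm": "Math Symbol",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category over many strings."""
    counts = Counter()
    for value in values:
        if value is None:  # skip missing cells
            continue
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# The five sample rows shown for stateProvince.
sample = ["Bocas Del Toro", "Utah", "Bolivar", "Oaxaca", "North Carolina"]
counts = category_counts(sample)
total = sum(counts.values())

for name, n in counts.most_common():
    print(f"{name:<18} {n:>3} {n / total:.1%}")
```

Run on a full column instead of five rows, the same tally would be expected to resemble the "Most occurring categories" table for that variable, provided the report classifies characters the same way.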

Unique

Unique 314
Unique (%) 0.1%

Sample

1st row  Bocas Del Toro
2nd row  Utah
3rd row  Bolivar
4th row  Oaxaca
5th row  North Carolina

Value                 Count     Frequency (%)
california            37958     5.7%
new                   18698     2.8%
alaska                18000     2.7%
oregon                15112     2.3%
province              15077     2.2%
arizona               13072     1.9%
virginia              12189     1.8%
washington            12057     1.8%
texas                 11524     1.7%
mexico                9875      1.5%
Other values (1720)   507096    75.6%

Most occurring characters

ValueCountFrequency (%)
a 685721
14.8%
i 388351
 
8.4%
n 356516
 
7.7%
o 350614
 
7.5%
r 326855
 
7.0%
e 277944
 
6.0%
l 192295
 
4.1%
s 173201
 
3.7%
t 172374
 
3.7%
163161
 
3.5%
Other values (65) 1559858
33.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3782086
81.4%
Uppercase Letter 683335
 
14.7%
Space Separator 163161
 
3.5%
Dash Punctuation 15111
 
0.3%
Other Punctuation 3190
 
0.1%
Decimal Number 4
 
< 0.1%
Math Symbol 1
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 685721
18.1%
i 388351
10.3%
n 356516
9.4%
o 350614
9.3%
r 326855
8.6%
e 277944
 
7.3%
l 192295
 
5.1%
s 173201
 
4.6%
t 172374
 
4.6%
u 116650
 
3.1%
Other values (25) 741565
19.6%
Uppercase Letter
ValueCountFrequency (%)
C 96322
14.1%
A 66126
 
9.7%
N 63963
 
9.4%
M 54370
 
8.0%
S 44892
 
6.6%
T 39318
 
5.8%
P 37886
 
5.5%
B 35544
 
5.2%
W 30828
 
4.5%
O 27556
 
4.0%
Other values (16) 186530
27.3%
Other Punctuation
ValueCountFrequency (%)
' 2998
94.0%
? 159
 
5.0%
/ 21
 
0.7%
* 6
 
0.2%
. 5
 
0.2%
: 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 2
50.0%
0 1
25.0%
8 1
25.0%
Space Separator
ValueCountFrequency (%)
163161
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 15111
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4465421
96.1%
Common 181469
 
3.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 685721
15.4%
i 388351
 
8.7%
n 356516
 
8.0%
o 350614
 
7.9%
r 326855
 
7.3%
e 277944
 
6.2%
l 192295
 
4.3%
s 173201
 
3.9%
t 172374
 
3.9%
u 116650
 
2.6%
Other values (51) 1424900
31.9%
Common
ValueCountFrequency (%)
163161
89.9%
- 15111
 
8.3%
' 2998
 
1.7%
? 159
 
0.1%
/ 21
 
< 0.1%
* 6
 
< 0.1%
. 5
 
< 0.1%
1 2
 
< 0.1%
0 1
 
< 0.1%
: 1
 
< 0.1%
Other values (4) 4
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4645873
> 99.9%
None 1017
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 685721
14.8%
i 388351
 
8.4%
n 356516
 
7.7%
o 350614
 
7.5%
r 326855
 
7.0%
e 277944
 
6.0%
l 192295
 
4.1%
s 173201
 
3.7%
t 172374
 
3.7%
163161
 
3.5%
Other values (56) 1558841
33.6%
None
ValueCountFrequency (%)
é 367
36.1%
ó 346
34.0%
ä 178
17.5%
ê 92
 
9.0%
ô 30
 
2.9%
ç 1
 
0.1%
ã 1
 
0.1%
ō 1
 
0.1%
æ 1
 
0.1%

county
Text

Missing 

Distinct 3194
Distinct (%) 2.1%
Missing 447402
Missing (%) 74.4%
Memory size 4.6 MiB

Length

Max length 47
Median length 27
Mean length 13.46725393
Min length 1

Characters and Unicode

Total characters 2074617
Distinct characters 79
Distinct categories 9
Distinct scripts 2
Distinct blocks 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 663
Unique (%) 0.4%

Sample

1st row  Carteret
2nd row  Cusco
3rd row  Monterey County
4th row  Galveston
5th row  Tamana Ward

Value                 Count     Frequency (%)
county                80697     27.5%
district              13828     4.7%
islands               3705      1.3%
division              3460      1.2%
san                   3315      1.1%
province              2619      0.9%
schoolcraft           2179      0.7%
mackenzie             1966      0.7%
lane                  1935      0.7%
municipality          1862      0.6%
Other values (2969)   178313    60.7%

Most occurring characters

ValueCountFrequency (%)
n 189818
 
9.1%
o 175404
 
8.5%
t 161467
 
7.8%
a 160330
 
7.7%
139830
 
6.7%
i 120188
 
5.8%
u 116014
 
5.6%
e 111686
 
5.4%
r 102364
 
4.9%
C 99007
 
4.8%
Other values (69) 698509
33.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1630734
78.6%
Uppercase Letter 298270
 
14.4%
Space Separator 139830
 
6.7%
Dash Punctuation 4189
 
0.2%
Other Punctuation 1555
 
0.1%
Close Punctuation 13
 
< 0.1%
Open Punctuation 13
 
< 0.1%
Decimal Number 8
 
< 0.1%
Modifier Letter 5
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 189818
11.6%
o 175404
10.8%
t 161467
9.9%
a 160330
9.8%
i 120188
 
7.4%
u 116014
 
7.1%
e 111686
 
6.8%
r 102364
 
6.3%
y 97639
 
6.0%
s 76836
 
4.7%
Other values (28) 318988
19.6%
Uppercase Letter
ValueCountFrequency (%)
C 99007
33.2%
D 27665
 
9.3%
S 18077
 
6.1%
M 17795
 
6.0%
B 15214
 
5.1%
P 13875
 
4.7%
A 12422
 
4.2%
L 11112
 
3.7%
G 10792
 
3.6%
W 8980
 
3.0%
Other values (17) 63331
21.2%
Other Punctuation
ValueCountFrequency (%)
' 1171
75.3%
. 192
 
12.3%
* 113
 
7.3%
? 56
 
3.6%
/ 21
 
1.4%
, 2
 
0.1%
Dash Punctuation
ValueCountFrequency (%)
- 4185
99.9%
4
 
0.1%
Decimal Number
ValueCountFrequency (%)
2 4
50.0%
4 4
50.0%
Space Separator
ValueCountFrequency (%)
139830
100.0%
Close Punctuation
ValueCountFrequency (%)
) 13
100.0%
Open Punctuation
ValueCountFrequency (%)
( 13
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1929004
93.0%
Common 145613
 
7.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 189818
 
9.8%
o 175404
 
9.1%
t 161467
 
8.4%
a 160330
 
8.3%
i 120188
 
6.2%
u 116014
 
6.0%
e 111686
 
5.8%
r 102364
 
5.3%
C 99007
 
5.1%
y 97639
 
5.1%
Other values (55) 595087
30.8%
Common
ValueCountFrequency (%)
139830
96.0%
- 4185
 
2.9%
' 1171
 
0.8%
. 192
 
0.1%
* 113
 
0.1%
? 56
 
< 0.1%
/ 21
 
< 0.1%
) 13
 
< 0.1%
( 13
 
< 0.1%
ʻ 5
 
< 0.1%
Other values (4) 14
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2074120
> 99.9%
None 485
 
< 0.1%
Modifier Letters 5
 
< 0.1%
Punctuation 4
 
< 0.1%
Latin Ext Additional 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 189818
 
9.2%
o 175404
 
8.5%
t 161467
 
7.8%
a 160330
 
7.7%
139830
 
6.7%
i 120188
 
5.8%
u 116014
 
5.6%
e 111686
 
5.4%
r 102364
 
4.9%
C 99007
 
4.8%
Other values (54) 698012
33.7%
None
ValueCountFrequency (%)
é 197
40.6%
í 176
36.3%
è 57
 
11.8%
ô 23
 
4.7%
ê 12
 
2.5%
ū 5
 
1.0%
ā 4
 
0.8%
Đ 3
 
0.6%
ơ 3
 
0.6%
à 3
 
0.6%
Other values (2) 2
 
0.4%
Modifier Letters
ValueCountFrequency (%)
ʻ 5
100.0%
Punctuation
ValueCountFrequency (%)
4
100.0%
Latin Ext Additional
ValueCountFrequency (%)
3
100.0%

locality
Text

Missing 

Distinct 86656
Distinct (%) 15.3%
Missing 35404
Missing (%) 5.9%
Memory size 4.6 MiB

Length

Max length 294
Median length 159
Mean length 21.69044267
Min length 1

Characters and Unicode

Total characters 12277810
Distinct characters 126
Distinct categories 13
Distinct scripts 3
Distinct blocks 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 52764
Unique (%) 9.3%

Sample

1st row  Tierra Oscura, 3.5 Km S. Tiger Key
2nd row  Uinta Forest, Currant Creek
3rd row  km. 125, 85 Km SSE El Dorado
4th row  Totontepec
5th row  Atlantic Beach, Atlantic Beach, 1/2 Mi E Of Triple S Pier.

Value                 Count     Frequency (%)
km                    82857     3.9%
mi                    82389     3.8%
of                    34259     1.6%
n                     30440     1.4%
river                 28140     1.3%
s                     27057     1.3%
e                     26413     1.2%
w                     26172     1.2%
island                23296     1.1%
san                   23251     1.1%
Other values (42744)  1760837   82.1%

Most occurring characters

ValueCountFrequency (%)
1579064
 
12.9%
a 1198874
 
9.8%
e 766623
 
6.2%
i 659790
 
5.4%
n 655819
 
5.3%
o 653029
 
5.3%
r 550116
 
4.5%
l 446951
 
3.6%
t 434393
 
3.5%
, 393002
 
3.2%
Other values (116) 4940149
40.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7761837
63.2%
Uppercase Letter 2026832
 
16.5%
Space Separator 1579064
 
12.9%
Other Punctuation 489421
 
4.0%
Decimal Number 361074
 
2.9%
Open Punctuation 19801
 
0.2%
Close Punctuation 19779
 
0.2%
Dash Punctuation 15950
 
0.1%
Math Symbol 3991
 
< 0.1%
Connector Punctuation 54
 
< 0.1%
Other values (3) 7
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1198874
15.4%
e 766623
9.9%
i 659790
 
8.5%
n 655819
 
8.4%
o 653029
 
8.4%
r 550116
 
7.1%
l 446951
 
5.8%
t 434393
 
5.6%
s 353920
 
4.6%
u 324066
 
4.2%
Other values (49) 1718256
22.1%
Uppercase Letter
ValueCountFrequency (%)
S 227459
 
11.2%
M 200123
 
9.9%
C 146981
 
7.3%
N 141674
 
7.0%
K 124320
 
6.1%
R 117187
 
5.8%
B 112739
 
5.6%
P 108902
 
5.4%
E 107630
 
5.3%
W 98674
 
4.9%
Other values (21) 641143
31.6%
Other Punctuation
ValueCountFrequency (%)
, 393002
80.3%
. 71840
 
14.7%
; 9568
 
2.0%
' 6996
 
1.4%
/ 2669
 
0.5%
: 2390
 
0.5%
" 1272
 
0.3%
? 612
 
0.1%
& 491
 
0.1%
# 388
 
0.1%
Other values (3) 193
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 75042
20.8%
2 56929
15.8%
5 50938
14.1%
0 37299
10.3%
3 35827
9.9%
4 29576
 
8.2%
6 25291
 
7.0%
8 19038
 
5.3%
7 17890
 
5.0%
9 13244
 
3.7%
Math Symbol
ValueCountFrequency (%)
= 3747
93.9%
+ 184
 
4.6%
~ 60
 
1.5%
Open Punctuation
ValueCountFrequency (%)
( 10013
50.6%
[ 9788
49.4%
Close Punctuation
ValueCountFrequency (%)
) 9993
50.5%
] 9786
49.5%
Space Separator
ValueCountFrequency (%)
1579064
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 15950
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 54
100.0%
Other Symbol
ValueCountFrequency (%)
° 3
100.0%
Final Punctuation
ValueCountFrequency (%)
2
100.0%
Other Number
ValueCountFrequency (%)
¼ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9788654
79.7%
Common 2489141
 
20.3%
Cyrillic 15
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1198874
 
12.2%
e 766623
 
7.8%
i 659790
 
6.7%
n 655819
 
6.7%
o 653029
 
6.7%
r 550116
 
5.6%
l 446951
 
4.6%
t 434393
 
4.4%
s 353920
 
3.6%
u 324066
 
3.3%
Other values (68) 3745073
38.3%
Common
ValueCountFrequency (%)
1579064
63.4%
, 393002
 
15.8%
1 75042
 
3.0%
. 71840
 
2.9%
2 56929
 
2.3%
5 50938
 
2.0%
0 37299
 
1.5%
3 35827
 
1.4%
4 29576
 
1.2%
6 25291
 
1.0%
Other values (26) 134333
 
5.4%
Cyrillic
ValueCountFrequency (%)
л 3
20.0%
к 2
13.3%
т 1
 
6.7%
і 1
 
6.7%
ө 1
 
6.7%
ы 1
 
6.7%
а 1
 
6.7%
м 1
 
6.7%
н 1
 
6.7%
е 1
 
6.7%
Other values (2) 2
13.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12277174
> 99.9%
None 619
 
< 0.1%
Cyrillic 15
 
< 0.1%
Punctuation 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1579064
 
12.9%
a 1198874
 
9.8%
e 766623
 
6.2%
i 659790
 
5.4%
n 655819
 
5.3%
o 653029
 
5.3%
r 550116
 
4.5%
l 446951
 
3.6%
t 434393
 
3.5%
, 393002
 
3.2%
Other values (75) 4939513
40.2%
None
ValueCountFrequency (%)
é 382
61.7%
è 107
 
17.3%
ø 19
 
3.1%
ñ 19
 
3.1%
á 11
 
1.8%
ö 11
 
1.8%
ã 7
 
1.1%
ü 7
 
1.1%
ó 7
 
1.1%
Œ 6
 
1.0%
Other values (18) 43
 
6.9%
Cyrillic
ValueCountFrequency (%)
л 3
20.0%
к 2
13.3%
т 1
 
6.7%
і 1
 
6.7%
ө 1
 
6.7%
ы 1
 
6.7%
а 1
 
6.7%
м 1
 
6.7%
н 1
 
6.7%
е 1
 
6.7%
Other values (2) 2
13.3%
Punctuation
ValueCountFrequency (%)
2
100.0%

verbatimElevation
Text

Missing 

Distinct 29
Distinct (%) 1.8%
Missing 599861
Missing (%) 99.7%
Memory size 4.6 MiB

Length

Max length 33
Median length 8
Mean length 8.518867925
Min length 2

Characters and Unicode

Total characters 13545
Distinct characters 43
Distinct categories 9
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 9
Unique (%) 0.6%

Sample

1st row  sea level
2nd row  sealevel
3rd row  sealevel
4th row  sealevel
5th row  see Osgood 1909:214

Value                 Count     Frequency (%)
sealevel              1096      46.9%
sea                   280       12.0%
level                 277       11.9%
ft                    143       6.1%
(blank)               104       4.5%
100                   81        3.5%
m                     59        2.5%
near                  32        1.4%
below                 30        1.3%
3                     28        1.2%
Other values (33)     206       8.8%

Most occurring characters

ValueCountFrequency (%)
e 4198
31.0%
l 2792
20.6%
a 1481
 
10.9%
s 1380
 
10.2%
v 1376
 
10.2%
746
 
5.5%
0 314
 
2.3%
t 156
 
1.2%
1 152
 
1.1%
f 143
 
1.1%
Other values (33) 807
 
6.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12018
88.7%
Space Separator 746
 
5.5%
Decimal Number 555
 
4.1%
Math Symbol 110
 
0.8%
Uppercase Letter 87
 
0.6%
Dash Punctuation 22
 
0.2%
Other Punctuation 5
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 4198
34.9%
l 2792
23.2%
a 1481
 
12.3%
s 1380
 
11.5%
v 1376
 
11.4%
t 156
 
1.3%
f 143
 
1.2%
c 92
 
0.8%
m 62
 
0.5%
r 61
 
0.5%
Other values (12) 277
 
2.3%
Decimal Number
ValueCountFrequency (%)
0 314
56.6%
1 152
27.4%
3 52
 
9.4%
5 16
 
2.9%
2 10
 
1.8%
7 6
 
1.1%
9 3
 
0.5%
4 2
 
0.4%
Uppercase Letter
ValueCountFrequency (%)
P 28
32.2%
G 28
32.2%
S 28
32.2%
M 1
 
1.1%
K 1
 
1.1%
O 1
 
1.1%
Other Punctuation
ValueCountFrequency (%)
. 3
60.0%
: 2
40.0%
Space Separator
ValueCountFrequency (%)
746
100.0%
Math Symbol
ValueCountFrequency (%)
< 110
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 22
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12105
89.4%
Common 1440
 
10.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 4198
34.7%
l 2792
23.1%
a 1481
 
12.2%
s 1380
 
11.4%
v 1376
 
11.4%
t 156
 
1.3%
f 143
 
1.2%
c 92
 
0.8%
m 62
 
0.5%
r 61
 
0.5%
Other values (18) 364
 
3.0%
Common
ValueCountFrequency (%)
746
51.8%
0 314
21.8%
1 152
 
10.6%
< 110
 
7.6%
3 52
 
3.6%
- 22
 
1.5%
5 16
 
1.1%
2 10
 
0.7%
7 6
 
0.4%
9 3
 
0.2%
Other values (5) 9
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13545
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 4198
31.0%
l 2792
20.6%
a 1481
 
10.9%
s 1380
 
10.2%
v 1376
 
10.2%
746
 
5.5%
0 314
 
2.3%
t 156
 
1.2%
1 152
 
1.1%
f 143
 
1.1%
Other values (33) 807
 
6.0%

decimalLatitude
Text

Missing 

Distinct 10276
Distinct (%) 6.7%
Missing 447917
Missing (%) 74.5%
Memory size 4.6 MiB

Length

Max length 9
Median length 8
Mean length 5.057648469
Min length 3

Characters and Unicode

Total characters 776521
Distinct characters 12
Distinct categories 3
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 4988
Unique (%) 3.2%

Sample

1st row  5.98
2nd row  34.68
3rd row  31.5011
4th row  29.37
5th row  34.4863

Value                 Count     Frequency (%)
5.3                   1716      1.1%
2.78                  1090      0.7%
5.67                  1073      0.7%
0.88                  979       0.6%
3.65                  946       0.6%
8.83                  814       0.5%
10.53                 811       0.5%
3.17                  798       0.5%
8.17                  759       0.5%
7.32                  742       0.5%
Other values (9288)   143806    93.7%

Most occurring characters

ValueCountFrequency (%)
. 153534
19.8%
3 86948
11.2%
2 77296
10.0%
1 68806
8.9%
5 67724
8.7%
8 61654
7.9%
7 57999
 
7.5%
6 42953
 
5.5%
0 42475
 
5.5%
9 41167
 
5.3%
Other values (2) 75965
9.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 585108
75.3%
Other Punctuation 153534
 
19.8%
Dash Punctuation 37879
 
4.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 86948
14.9%
2 77296
13.2%
1 68806
11.8%
5 67724
11.6%
8 61654
10.5%
7 57999
9.9%
6 42953
7.3%
0 42475
7.3%
9 41167
7.0%
4 38086
6.5%
Other Punctuation
ValueCountFrequency (%)
. 153534
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 37879
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 776521
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 153534
19.8%
3 86948
11.2%
2 77296
10.0%
1 68806
8.9%
5 67724
8.7%
8 61654
7.9%
7 57999
 
7.5%
6 42953
 
5.5%
0 42475
 
5.5%
9 41167
 
5.3%
Other values (2) 75965
9.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 776521
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 153534
19.8%
3 86948
11.2%
2 77296
10.0%
1 68806
8.9%
5 67724
8.7%
8 61654
7.9%
7 57999
 
7.5%
6 42953
 
5.5%
0 42475
 
5.5%
9 41167
 
5.3%
Other values (2) 75965
9.8%

decimalLongitude
Text

Missing 

Distinct 11872
Distinct (%) 7.7%
Missing 447917
Missing (%) 74.5%
Memory size 4.6 MiB

Length

Max length 10
Median length 9
Mean length 5.660244636
Min length 3

Characters and Unicode

Total characters 869040
Distinct characters 12
Distinct categories 3
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 5896
Unique (%) 3.8%

Sample

1st row  -61.43
2nd row  -76.7
3rd row  65.8453
4th row  -94.82
5th row  74.6026

Value                 Count     Frequency (%)
66.22                 1723      1.1%
16.42                 1090      0.7%
127.68                955       0.6%
0.2                   930       0.6%
70.5                  790       0.5%
71.95                 739       0.5%
79.62                 722       0.5%
0.22                  681       0.4%
0.97                  651       0.4%
66.18                 629       0.4%
Other values (11081)  144624    94.2%

Most occurring characters

ValueCountFrequency (%)
. 153534
17.7%
- 87899
10.1%
2 86728
10.0%
1 82470
9.5%
7 81082
9.3%
3 68918
7.9%
6 62220
7.2%
5 59306
 
6.8%
8 58821
 
6.8%
0 51094
 
5.9%
Other values (2) 76968
8.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 627607
72.2%
Other Punctuation 153534
 
17.7%
Dash Punctuation 87899
 
10.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 86728
13.8%
1 82470
13.1%
7 81082
12.9%
3 68918
11.0%
6 62220
9.9%
5 59306
9.4%
8 58821
9.4%
0 51094
8.1%
4 38863
6.2%
9 38105
6.1%
Other Punctuation
ValueCountFrequency (%)
. 153534
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 87899
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 869040
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 153534
17.7%
- 87899
10.1%
2 86728
10.0%
1 82470
9.5%
7 81082
9.3%
3 68918
7.9%
6 62220
7.2%
5 59306
 
6.8%
8 58821
 
6.8%
0 51094
 
5.9%
Other values (2) 76968
8.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 869040
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 153534
17.7%
- 87899
10.1%
2 86728
10.0%
1 82470
9.5%
7 81082
9.3%
3 68918
7.9%
6 62220
7.2%
5 59306
 
6.8%
8 58821
 
6.8%
0 51094
 
5.9%
Other values (2) 76968
8.9%
Distinct 4
Distinct (%) < 0.1%
Missing 468202
Missing (%) 77.8%
Memory size 4.6 MiB

Length

Max length 23
Median length 23
Mean length 22.96475771
Min length 3

Characters and Unicode

Total characters 3060031
Distinct characters 22
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) < 0.1%

Sample

1st row  Degrees Minutes Seconds
2nd row  Degrees Minutes Seconds
3rd row  Degrees Minutes Seconds
4th row  Degrees Minutes Seconds
5th row  Degrees Minutes Seconds

Value                 Count     Frequency (%)
degrees               133004    33.3%
minutes               133003    33.3%
seconds               133003    33.3%
utm                   192       < 0.1%
unknown               53        < 0.1%
decimal               1         < 0.1%

Most occurring characters

ValueCountFrequency (%)
e 665019
21.7%
s 399010
13.0%
n 266165
 
8.7%
266007
 
8.7%
M 133195
 
4.4%
o 133056
 
4.3%
D 133004
 
4.3%
c 133004
 
4.3%
g 133004
 
4.3%
r 133004
 
4.3%
Other values (12) 665563
21.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2394385
78.2%
Uppercase Letter 399639
 
13.1%
Space Separator 266007
 
8.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 665019
27.8%
s 399010
16.7%
n 266165
11.1%
o 133056
 
5.6%
c 133004
 
5.6%
g 133004
 
5.6%
r 133004
 
5.6%
i 133004
 
5.6%
d 133004
 
5.6%
t 133003
 
5.6%
Other values (6) 133112
 
5.6%
Uppercase Letter
ValueCountFrequency (%)
M 133195
33.3%
D 133004
33.3%
S 133003
33.3%
U 245
 
0.1%
T 192
 
< 0.1%
Space Separator
ValueCountFrequency (%)
266007
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2794024
91.3%
Common 266007
 
8.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 665019
23.8%
s 399010
14.3%
n 266165
9.5%
M 133195
 
4.8%
o 133056
 
4.8%
D 133004
 
4.8%
c 133004
 
4.8%
g 133004
 
4.8%
r 133004
 
4.8%
i 133004
 
4.8%
Other values (11) 532559
19.1%
Common
ValueCountFrequency (%)
266007
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3060031
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 665019
21.7%
s 399010
13.0%
n 266165
 
8.7%
266007
 
8.7%
M 133195
 
4.4%
o 133056
 
4.3%
D 133004
 
4.3%
c 133004
 
4.3%
g 133004
 
4.3%
r 133004
 
4.3%
Other values (12) 665563
21.8%

georeferenceProtocol
Text

Missing 

Distinct 8
Distinct (%) 0.1%
Missing 592196
Missing (%) 98.5%
Memory size 4.6 MiB

Length

Max length 26
Median length 12
Mean length 10.66731496
Min length 3

Characters and Unicode

Total characters 98726
Distinct characters 32
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) < 0.1%

Sample

1st row  Google Earth
2nd row  Google Earth
3rd row  GPS
4th row  Google Earth
5th row  Google Earth

Value                 Count     Frequency (%)
google                7074      41.5%
earth                 7074      41.5%
gps                   1418      8.3%
usgs                  530       3.1%
topoview              530       3.1%
gazetteer             137       0.8%
atlas                 42        0.2%
of                    42        0.2%
canada                42        0.2%
(blank)               42        0.2%
Other values (4)      96        0.6%

Most occurring characters

ValueCountFrequency (%)
o 15334
15.5%
G 9159
9.3%
e 8096
8.2%
t 8000
8.1%
7772
7.9%
a 7479
7.6%
r 7294
7.4%
l 7116
7.2%
h 7076
7.2%
g 7074
7.2%
Other values (22) 14326
14.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 69543
70.4%
Uppercase Letter 21368
 
21.6%
Space Separator 7772
 
7.9%
Dash Punctuation 42
 
< 0.1%
Other Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 15334
22.0%
e 8096
11.6%
t 8000
11.5%
a 7479
10.8%
r 7294
10.5%
l 7116
10.2%
h 7076
10.2%
g 7074
10.2%
p 586
 
0.8%
w 530
 
0.8%
Other values (8) 958
 
1.4%
Uppercase Letter
ValueCountFrequency (%)
G 9159
42.9%
E 7074
33.1%
S 2478
 
11.6%
P 1418
 
6.6%
V 530
 
2.5%
U 530
 
2.5%
A 42
 
0.2%
C 42
 
0.2%
T 42
 
0.2%
I 39
 
0.2%
Space Separator
ValueCountFrequency (%)
7772
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 42
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 90911
92.1%
Common 7815
 
7.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 15334
16.9%
G 9159
10.1%
e 8096
8.9%
t 8000
8.8%
a 7479
8.2%
r 7294
8.0%
l 7116
7.8%
h 7076
7.8%
g 7074
7.8%
E 7074
7.8%
Other values (19) 7209
7.9%
Common
ValueCountFrequency (%)
7772
99.4%
- 42
 
0.5%
. 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 98726
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 15334
15.5%
G 9159
9.3%
e 8096
8.2%
t 8000
8.1%
7772
7.9%
a 7479
7.6%
r 7294
7.4%
l 7116
7.2%
h 7076
7.2%
g 7074
7.2%
Other values (22) 14326
14.5%

georeferenceRemarks
Text

Missing 

Distinct 8
Distinct (%) 11.8%
Missing 601383
Missing (%) > 99.9%
Memory size 4.6 MiB

Length

Max length 35
Median length 35
Mean length 31.20588235
Min length 5

Characters and Unicode

Total characters 2122
Distinct characters 34
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 4
Unique (%) 5.9%

Sample

1st row  Garmin Etrex Vista HCX, Datum WGS84
2nd row  Garmin Etrex Vista HCX, Datum WGS84
3rd row  Garmin Etrex Vista HCX, Datum WGS84
4th row  Garmin Etrex Vista HCX, Datum WGS84
5th row  Garmin Etrex Vista HCX, Datum WGS84

Value                 Count     Frequency (%)
garmin                54        15.1%
etrex                 54        15.1%
vista                 54        15.1%
hcx                   54        15.1%
datum                 54        15.1%
wgs84                 54        15.1%
camp                  7         2.0%
coordinates           7         2.0%
for                   6         1.7%
longitude             2         0.6%
Other values (7)      12        3.4%

Most occurring characters

ValueCountFrequency (%)
290
 
13.7%
a 184
 
8.7%
t 175
 
8.2%
r 132
 
6.2%
i 123
 
5.8%
m 118
 
5.6%
G 108
 
5.1%
e 73
 
3.4%
n 67
 
3.2%
s 62
 
2.9%
Other values (24) 790
37.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1118
52.7%
Uppercase Letter 551
26.0%
Space Separator 290
 
13.7%
Decimal Number 108
 
5.1%
Other Punctuation 55
 
2.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 184
16.5%
t 175
15.7%
r 132
11.8%
i 123
11.0%
m 118
10.6%
e 73
 
6.5%
n 67
 
6.0%
s 62
 
5.5%
u 58
 
5.2%
x 56
 
5.0%
Other values (8) 70
 
6.3%
Uppercase Letter
ValueCountFrequency (%)
G 108
19.6%
C 61
11.1%
S 54
9.8%
W 54
9.8%
D 54
9.8%
X 54
9.8%
H 54
9.8%
V 54
9.8%
E 54
9.8%
L 2
 
0.4%
Decimal Number
ValueCountFrequency (%)
4 54
50.0%
8 54
50.0%
Other Punctuation
ValueCountFrequency (%)
, 54
98.2%
; 1
 
1.8%
Space Separator
ValueCountFrequency (%)
290
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1669
78.7%
Common 453
 
21.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 184
 
11.0%
t 175
 
10.5%
r 132
 
7.9%
i 123
 
7.4%
m 118
 
7.1%
G 108
 
6.5%
e 73
 
4.4%
n 67
 
4.0%
s 62
 
3.7%
C 61
 
3.7%
Other values (19) 566
33.9%
Common
ValueCountFrequency (%)
290
64.0%
4 54
 
11.9%
8 54
 
11.9%
, 54
 
11.9%
; 1
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2122
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
290
 
13.7%
a 184
 
8.7%
t 175
 
8.2%
r 132
 
6.2%
i 123
 
5.8%
m 118
 
5.6%
G 108
 
5.1%
e 73
 
3.4%
n 67
 
3.2%
s 62
 
2.9%
Other values (24) 790
37.2%
Distinct 4
Distinct (%) 0.3%
Missing 599947
Missing (%) 99.7%
Memory size 4.6 MiB

Length

Max length 9
Median length 9
Mean length 8.412234043
Min length 3

Characters and Unicode

Total characters 12652
Distinct characters 14
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row  uncertain
2nd row  uncertain
3rd row  uncertain
4th row  uncertain
5th row  cf.

Value                 Count     Frequency (%)
uncertain             1355      90.0%
cf                    147       9.8%
sp                    2         0.1%
near                  2         0.1%

Most occurring characters

ValueCountFrequency (%)
n 2712
21.4%
c 1502
11.9%
e 1357
10.7%
r 1357
10.7%
a 1357
10.7%
t 1355
10.7%
i 1355
10.7%
u 1315
10.4%
. 149
 
1.2%
f 147
 
1.2%
Other values (4) 46
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12461
98.5%
Other Punctuation 149
 
1.2%
Uppercase Letter 40
 
0.3%
Space Separator 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 2712
21.8%
c 1502
12.1%
e 1357
10.9%
r 1357
10.9%
a 1357
10.9%
t 1355
10.9%
i 1355
10.9%
u 1315
10.6%
f 147
 
1.2%
s 2
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
. 149
100.0%
Uppercase Letter
ValueCountFrequency (%)
U 40
100.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12501
98.8%
Common 151
 
1.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 2712
21.7%
c 1502
12.0%
e 1357
10.9%
r 1357
10.9%
a 1357
10.9%
t 1355
10.8%
i 1355
10.8%
u 1315
10.5%
f 147
 
1.2%
U 40
 
0.3%
Other values (2) 4
 
< 0.1%
Common
ValueCountFrequency (%)
. 149
98.7%
2
 
1.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12652
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 2712
21.4%
c 1502
11.9%
e 1357
10.7%
r 1357
10.7%
a 1357
10.7%
t 1355
10.7%
i 1355
10.7%
u 1315
10.4%
. 149
 
1.2%
f 147
 
1.2%
Other values (4) 46
 
0.4%

typeStatus
Text

Missing 

Distinct 5
Distinct (%) 0.1%
Missing 597715
Missing (%) 99.4%
Memory size 4.6 MiB

Length

Max length 9
Median length 4
Mean length 4.176391863
Min length 4

Characters and Unicode

Total characters 15603
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
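
A minimal sketch of how per-category character counts of this kind could be derived with Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, and the report's own implementation may differ; `unicodedata.category` returns two-letter codes such as `Lu`, which corresponds to "Uppercase Letter".

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters per Unicode general category, skipping missing values."""
    counts = Counter()
    for value in values:
        if value is None:
            continue
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample rows shown for typeStatus, plus one missing value.
sample = ["LECTOTYPE", "TYPE", "TYPE", "TYPE", "TYPE", None]
counts = category_counts(sample)
print(counts)
```

On these sample rows every character falls in the `Lu` category, in line with the single "Uppercase Letter" category reported for the full column.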

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowLECTOTYPE
2nd rowTYPE
3rd rowTYPE
4th rowTYPE
5th rowTYPE
Value       Count   Frequency (%)
type        3565    95.4%
syntype     80      2.1%
lectotype   67      1.8%
neotype     12      0.3%
holotype    12      0.3%

Most occurring characters

ValueCountFrequency (%)
Y 3816
24.5%
E 3815
24.5%
T 3803
24.4%
P 3736
23.9%
O 103
 
0.7%
N 92
 
0.6%
S 80
 
0.5%
L 79
 
0.5%
C 67
 
0.4%
H 12
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 15603
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
Y 3816
24.5%
E 3815
24.5%
T 3803
24.4%
P 3736
23.9%
O 103
 
0.7%
N 92
 
0.6%
S 80
 
0.5%
L 79
 
0.5%
C 67
 
0.4%
H 12
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 15603
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
Y 3816
24.5%
E 3815
24.5%
T 3803
24.4%
P 3736
23.9%
O 103
 
0.7%
N 92
 
0.6%
S 80
 
0.5%
L 79
 
0.5%
C 67
 
0.4%
H 12
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15603
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
Y 3816
24.5%
E 3815
24.5%
T 3803
24.4%
P 3736
23.9%
O 103
 
0.7%
N 92
 
0.6%
S 80
 
0.5%
L 79
 
0.5%
C 67
 
0.4%
H 12
 
0.1%

identifiedBy
Text

Missing 

Distinct95
Distinct (%)1.2%
Missing593267
Missing (%)98.6%
Memory size4.6 MiB

Length

Max length132
Median length124
Mean length94.36840176
Min length10

Characters and Unicode

Total characters772311
Distinct characters58
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique21 ?
Unique (%)0.3%

Sample

1st rowO'Neill, Jennifer K., Fort Hayes State University
2nd rowGardner, Alfred L., Curator (USGS), United States Geological Survey (UNITED STATES)
3rd rowWoodman, Neal, (USGS), United States Geological Survey (UNITED STATES)
4th rowLunde, Darrin P., Collections Manager (MAM), Smithsonian Institution - National Museum of Natural History (UNITED STATES)
5th rowReeder, DeeAnn M., Bucknell University (UNITED STATES)
Value                Count   Frequency (%)
states               8033    7.9%
united               8033    7.9%
of                   5420    5.3%
museum               5255    5.2%
natural              5077    5.0%
history              5077    5.0%
national             5064    5.0%
smithsonian          5007    4.9%
institution          5007    4.9%
(empty)              4859    4.8%
Other values (272)   44753   44.1%
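
The values in this table are lowercase single words, while the sample rows are full mixed-case strings, which suggests the values are lowercased, split on whitespace, and stripped of surrounding punctuation before counting. The sketch below illustrates that assumed procedure; the function `word_counts` is hypothetical and is not taken from the report itself. Under this assumption, a free-standing punctuation token such as the " - " in the fourth sample row becomes an empty string, which is one possible explanation for the row whose value is empty (count 4859).

```python
import string
from collections import Counter


def word_counts(values):
    """Lowercase, split on whitespace, strip punctuation, then count tokens."""
    strip_chars = string.punctuation + string.whitespace
    counts = Counter()
    for value in values:
        if value is None:
            continue
        for token in value.lower().split():
            counts[token.strip(strip_chars)] += 1
    return counts


# Two of the sample rows shown for identifiedBy.
rows = [
    "Woodman, Neal, (USGS), United States Geological Survey (UNITED STATES)",
    "Lunde, Darrin P., Collections Manager (MAM), Smithsonian Institution - "
    "National Museum of Natural History (UNITED STATES)",
]
counts = word_counts(rows)
print(counts.most_common(3))
```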

Most occurring characters

ValueCountFrequency (%)
93401
 
12.1%
t 49895
 
6.5%
o 47659
 
6.2%
i 45409
 
5.9%
a 41696
 
5.4%
e 39504
 
5.1%
n 38647
 
5.0%
s 36580
 
4.7%
r 29451
 
3.8%
u 25575
 
3.3%
Other values (48) 324494
42.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 444828
57.6%
Uppercase Letter 174361
 
22.6%
Space Separator 93401
 
12.1%
Other Punctuation 28613
 
3.7%
Open Punctuation 13070
 
1.7%
Close Punctuation 13070
 
1.7%
Dash Punctuation 4968
 
0.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 49895
11.2%
o 47659
10.7%
i 45409
10.2%
a 41696
9.4%
e 39504
8.9%
n 38647
8.7%
s 36580
8.2%
r 29451
6.6%
u 25575
 
5.7%
l 25043
 
5.6%
Other values (15) 65369
14.7%
Uppercase Letter
ValueCountFrequency (%)
S 24263
13.9%
T 20751
11.9%
M 20290
11.6%
N 18192
10.4%
E 15013
8.6%
A 13295
7.6%
I 12131
7.0%
U 9985
5.7%
D 9079
 
5.2%
H 8379
 
4.8%
Other values (14) 22983
13.2%
Other Punctuation
ValueCountFrequency (%)
, 22186
77.5%
. 6353
 
22.2%
' 69
 
0.2%
; 4
 
< 0.1%
& 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
93401
100.0%
Open Punctuation
ValueCountFrequency (%)
( 13070
100.0%
Close Punctuation
ValueCountFrequency (%)
) 13070
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4968
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 619189
80.2%
Common 153122
 
19.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 49895
 
8.1%
o 47659
 
7.7%
i 45409
 
7.3%
a 41696
 
6.7%
e 39504
 
6.4%
n 38647
 
6.2%
s 36580
 
5.9%
r 29451
 
4.8%
u 25575
 
4.1%
l 25043
 
4.0%
Other values (39) 239730
38.7%
Common
ValueCountFrequency (%)
93401
61.0%
, 22186
 
14.5%
( 13070
 
8.5%
) 13070
 
8.5%
. 6353
 
4.1%
- 4968
 
3.2%
' 69
 
< 0.1%
; 4
 
< 0.1%
& 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 772311
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
93401
 
12.1%
t 49895
 
6.5%
o 47659
 
6.2%
i 45409
 
5.9%
a 41696
 
5.4%
e 39504
 
5.1%
n 38647
 
5.0%
s 36580
 
4.7%
r 29451
 
3.8%
u 25575
 
3.3%
Other values (48) 324494
42.0%
Distinct6815
Distinct (%)1.1%
Missing0
Missing (%)0.0%
Memory size4.6 MiB

Length

Max length8
Median length7
Mean length6.999645856
Min length2

Characters and Unicode

Total characters4209944
Distinct characters10
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique793 ?
Unique (%)0.1%
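
"Distinct" and "Unique" measure different things. The sketch below assumes that "Distinct" is the number of different values and "Unique" is the number of values that occur exactly once; that reading is an assumption rather than something this report states. The helper `distinct_and_unique` is hypothetical.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(v for v in values if v is not None)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# The first five entries are the sample rows of this variable; the repeated
# key and the missing value are added here purely for illustration.
keys = ["2433573", "2438621", "2433177", "2438034", "2440447", "2440447", None]
result = distinct_and_unique(keys)
print(result)
```

Here five different keys appear, but only four of them occur exactly once, so the two figures differ.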

Sample

1st row2433573
2nd row2438621
3rd row2433177
4th row2438034
5th row2440447
Value                 Count    Frequency (%)
2437967               14724    2.4%
2440447               11867    2.0%
2438904               8874     1.5%
2433176               8329     1.4%
2438019               7347     1.2%
2438655               6840     1.1%
2433272               5470     0.9%
2439270               5412     0.9%
2437782               5206     0.9%
4264939               4687     0.8%
Other values (6805)   522695   86.9%

Most occurring characters

ValueCountFrequency (%)
2 695586
16.5%
4 682390
16.2%
3 540567
12.8%
6 455861
10.8%
7 389135
9.2%
1 323452
7.7%
8 317316
7.5%
9 293353
7.0%
5 267471
 
6.4%
0 244813
 
5.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4209944
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 695586
16.5%
4 682390
16.2%
3 540567
12.8%
6 455861
10.8%
7 389135
9.2%
1 323452
7.7%
8 317316
7.5%
9 293353
7.0%
5 267471
 
6.4%
0 244813
 
5.8%

Most occurring scripts

ValueCountFrequency (%)
Common 4209944
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 695586
16.5%
4 682390
16.2%
3 540567
12.8%
6 455861
10.8%
7 389135
9.2%
1 323452
7.7%
8 317316
7.5%
9 293353
7.0%
5 267471
 
6.4%
0 244813
 
5.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4209944
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 695586
16.5%
4 682390
16.2%
3 540567
12.8%
6 455861
10.8%
7 389135
9.2%
1 323452
7.7%
8 317316
7.5%
9 293353
7.0%
5 267471
 
6.4%
0 244813
 
5.8%
Distinct7326
Distinct (%)1.2%
Missing0
Missing (%)0.0%
Memory size4.6 MiB

Length

Max length147
Median length72
Mean length35.02832483
Min length7

Characters and Unicode

Total characters21067821
Distinct characters80
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique849 ?
Unique (%)0.1%

Sample

1st rowPotos flavus (Schreber, 1774)
2nd rowMicrotus longicaudus longicaudus
3rd rowCarollia brevicaudum (Schinz, 1821)
4th rowPeromyscus mexicanus totontepecus Merriam, 1898
5th rowTursiops truncatus (Montagu, 1821)
Value                 Count     Frequency (%)
linnaeus              52995     2.1%
1758                  48641     1.9%
thomas                44736     1.8%
peromyscus            38753     1.5%
merriam               29181     1.2%
(empty)               25993     1.0%
rattus                21929     0.9%
1821                  21801     0.9%
microtus              19877     0.8%
j.a.allen             18118     0.7%
Other values (6496)   2183955   87.1%

Most occurring characters

ValueCountFrequency (%)
1904528
 
9.0%
s 1657434
 
7.9%
a 1400545
 
6.6%
i 1320831
 
6.3%
e 1211612
 
5.8%
r 1087984
 
5.2%
u 1063056
 
5.0%
o 1056373
 
5.0%
n 919028
 
4.4%
l 817723
 
3.9%
Other values (70) 8628707
41.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 14532612
69.0%
Decimal Number 2105528
 
10.0%
Space Separator 1904528
 
9.0%
Uppercase Letter 1252548
 
5.9%
Other Punctuation 638107
 
3.0%
Close Punctuation 313875
 
1.5%
Open Punctuation 313875
 
1.5%
Dash Punctuation 6748
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 1657434
11.4%
a 1400545
9.6%
i 1320831
9.1%
e 1211612
 
8.3%
r 1087984
 
7.5%
u 1063056
 
7.3%
o 1056373
 
7.3%
n 919028
 
6.3%
l 817723
 
5.6%
t 702630
 
4.8%
Other values (24) 3295396
22.7%
Uppercase Letter
ValueCountFrequency (%)
M 175044
14.0%
P 119561
9.5%
T 111746
 
8.9%
S 105807
 
8.4%
L 95529
 
7.6%
A 94300
 
7.5%
G 80592
 
6.4%
C 73937
 
5.9%
B 55966
 
4.5%
R 50699
 
4.0%
Other values (18) 289367
23.1%
Decimal Number
ValueCountFrequency (%)
1 644905
30.6%
8 441758
21.0%
9 227519
 
10.8%
7 162226
 
7.7%
5 138662
 
6.6%
0 127816
 
6.1%
4 100538
 
4.8%
3 96034
 
4.6%
2 89296
 
4.2%
6 76774
 
3.6%
Other Punctuation
ValueCountFrequency (%)
, 528229
82.8%
. 83107
 
13.0%
& 25993
 
4.1%
' 778
 
0.1%
Space Separator
ValueCountFrequency (%)
1904528
100.0%
Close Punctuation
ValueCountFrequency (%)
) 313875
100.0%
Open Punctuation
ValueCountFrequency (%)
( 313875
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6748
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 15785160
74.9%
Common 5282661
 
25.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 1657434
 
10.5%
a 1400545
 
8.9%
i 1320831
 
8.4%
e 1211612
 
7.7%
r 1087984
 
6.9%
u 1063056
 
6.7%
o 1056373
 
6.7%
n 919028
 
5.8%
l 817723
 
5.2%
t 702630
 
4.5%
Other values (52) 4547944
28.8%
Common
ValueCountFrequency (%)
1904528
36.1%
1 644905
 
12.2%
, 528229
 
10.0%
8 441758
 
8.4%
) 313875
 
5.9%
( 313875
 
5.9%
9 227519
 
4.3%
7 162226
 
3.1%
5 138662
 
2.6%
0 127816
 
2.4%
Other values (8) 479268
 
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 21054857
99.9%
None 12964
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1904528
 
9.0%
s 1657434
 
7.9%
a 1400545
 
6.7%
i 1320831
 
6.3%
e 1211612
 
5.8%
r 1087984
 
5.2%
u 1063056
 
5.0%
o 1056373
 
5.0%
n 919028
 
4.4%
l 817723
 
3.9%
Other values (60) 8615743
40.9%
None
ValueCountFrequency (%)
ü 5095
39.3%
É 4244
32.7%
é 1615
 
12.5%
è 1387
 
10.7%
ö 331
 
2.6%
á 96
 
0.7%
ñ 78
 
0.6%
í 70
 
0.5%
Ä 24
 
0.2%
ä 24
 
0.2%
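
The characters listed under the "None" block are accented Latin letters outside the ASCII range. A minimal sketch for locating such characters and naming them with the standard `unicodedata` module follows; the helper `non_ascii_counts` and the first two input strings are hypothetical, and only the third string comes from this report's sample rows.

```python
import unicodedata
from collections import Counter


def non_ascii_counts(values):
    """Count every character whose code point lies outside the ASCII range."""
    counts = Counter()
    for value in values:
        for ch in value or "":
            if ord(ch) > 127:
                counts[ch] += 1
    return counts


values = [
    "M\u00fcller",                    # hypothetical string containing U+00FC
    "\u00c9. Geoffroy",               # hypothetical string containing U+00C9
    "Potos flavus (Schreber, 1774)",  # sample row, ASCII only
]
found = non_ascii_counts(values)
for ch, n in sorted(found.items()):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {n}")
```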
Distinct253
Distinct (%)< 0.1%
Missing7
Missing (%)< 0.1%
Memory size4.6 MiB

Length

Max length121
Median length113
Mean length90.64064651
Min length11

Characters and Unicode

Total characters54515273
Distinct characters48
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique10 ?
Unique (%)< 0.1%

Sample

1st rowAnimalia, Chordata, Vertebrata, Mammalia, Eutheria, Carnivora, Caniformia, Procyonidae
2nd rowAnimalia, Chordata, Vertebrata, Mammalia, Eutheria, Rodentia, Myomorpha, Cricetidae, Arvicolinae
3rd rowAnimalia, Chordata, Vertebrata, Mammalia, Eutheria, Chiroptera, Phyllostomidae, Carolliinae
4th rowAnimalia, Chordata, Vertebrata, Mammalia, Eutheria, Rodentia, Myomorpha, Cricetidae, Neotominae
5th rowAnimalia, Chordata, Vertebrata, Mammalia, Eutheria, Cetacea, Odontoceti, Delphinidae
Value                Count     Frequency (%)
animalia             601442    11.9%
vertebrata           601442    11.9%
chordata             601442    11.9%
mammalia             601441    11.9%
eutheria             593341    11.7%
rodentia             297636    5.9%
myomorpha            209417    4.1%
chiroptera           129086    2.5%
cricetidae           107243    2.1%
muridae              93911     1.9%
Other values (328)   1234181   24.3%

Most occurring characters

ValueCountFrequency (%)
a 8383067
15.4%
i 4797600
 
8.8%
, 4469138
 
8.2%
4469138
 
8.2%
e 4068524
 
7.5%
r 4037606
 
7.4%
t 3533330
 
6.5%
o 2704288
 
5.0%
m 2453478
 
4.5%
h 1861673
 
3.4%
Other values (38) 13737431
25.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 40506415
74.3%
Uppercase Letter 5070582
 
9.3%
Other Punctuation 4469138
 
8.2%
Space Separator 4469138
 
8.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 8383067
20.7%
i 4797600
11.8%
e 4068524
10.0%
r 4037606
10.0%
t 3533330
8.7%
o 2704288
 
6.7%
m 2453478
 
6.1%
h 1861673
 
4.6%
n 1678993
 
4.1%
l 1675363
 
4.1%
Other values (14) 5312493
13.1%
Uppercase Letter
ValueCountFrequency (%)
C 1067853
21.1%
M 1065447
21.0%
A 654487
12.9%
V 641586
12.7%
E 615945
12.1%
R 302881
 
6.0%
S 237180
 
4.7%
P 112443
 
2.2%
D 65158
 
1.3%
N 62146
 
1.2%
Other values (12) 245456
 
4.8%
Other Punctuation
ValueCountFrequency (%)
, 4469138
100.0%
Space Separator
ValueCountFrequency (%)
4469138
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 45576997
83.6%
Common 8938276
 
16.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 8383067
18.4%
i 4797600
10.5%
e 4068524
 
8.9%
r 4037606
 
8.9%
t 3533330
 
7.8%
o 2704288
 
5.9%
m 2453478
 
5.4%
h 1861673
 
4.1%
n 1678993
 
3.7%
l 1675363
 
3.7%
Other values (36) 10383075
22.8%
Common
ValueCountFrequency (%)
, 4469138
50.0%
4469138
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 54515273
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 8383067
15.4%
i 4797600
 
8.8%
, 4469138
 
8.2%
4469138
 
8.2%
e 4068524
 
7.5%
r 4037606
 
7.4%
t 3533330
 
6.5%
o 2704288
 
5.0%
m 2453478
 
4.5%
h 1861673
 
3.4%
Other values (38) 13737431
25.2%

kingdom
Text

Constant 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.6 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters4811608
Distinct characters6
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowAnimalia
2nd rowAnimalia
3rd rowAnimalia
4th rowAnimalia
5th rowAnimalia
Value      Count    Frequency (%)
animalia   601451   100.0%

Most occurring characters

ValueCountFrequency (%)
i 1202902
25.0%
a 1202902
25.0%
A 601451
12.5%
n 601451
12.5%
m 601451
12.5%
l 601451
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4210157
87.5%
Uppercase Letter 601451
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 1202902
28.6%
a 1202902
28.6%
n 601451
14.3%
m 601451
14.3%
l 601451
14.3%
Uppercase Letter
ValueCountFrequency (%)
A 601451
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4811608
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 1202902
25.0%
a 1202902
25.0%
A 601451
12.5%
n 601451
12.5%
m 601451
12.5%
l 601451
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4811608
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 1202902
25.0%
a 1202902
25.0%
A 601451
12.5%
n 601451
12.5%
m 601451
12.5%
l 601451
12.5%

phylum
Text

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.6 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters4811608
Distinct characters12
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowChordata
2nd rowChordata
3rd rowChordata
4th rowChordata
5th rowChordata
Value      Count    Frequency (%)
chordata   601449   > 99.9%
mollusca   2        < 0.1%
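
Two records carry a phylum other than Chordata. A hedged sketch of how such rows could be isolated for review is shown below; the records are hypothetical stand-ins, not rows from the dataset, and `unexpected_rows` is an illustrative helper.

```python
def unexpected_rows(rows, field, expected):
    """Return (index, row) pairs whose value for `field` differs from `expected`."""
    return [(i, row) for i, row in enumerate(rows) if row.get(field) != expected]


# Hypothetical records standing in for the real occurrence table.
records = [
    {"phylum": "Chordata"},
    {"phylum": "Chordata"},
    {"phylum": "Mollusca"},
]
flagged = unexpected_rows(records, "phylum", "Chordata")
print(flagged)
```

Rows in which the field is absent are flagged as well, because `dict.get` returns `None` for them.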

Most occurring characters

ValueCountFrequency (%)
a 1202900
25.0%
o 601451
12.5%
C 601449
12.5%
h 601449
12.5%
r 601449
12.5%
d 601449
12.5%
t 601449
12.5%
l 4
 
< 0.1%
M 2
 
< 0.1%
u 2
 
< 0.1%
Other values (2) 4
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4210157
87.5%
Uppercase Letter 601451
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1202900
28.6%
o 601451
14.3%
h 601449
14.3%
r 601449
14.3%
d 601449
14.3%
t 601449
14.3%
l 4
 
< 0.1%
u 2
 
< 0.1%
s 2
 
< 0.1%
c 2
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
C 601449
> 99.9%
M 2
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 4811608
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1202900
25.0%
o 601451
12.5%
C 601449
12.5%
h 601449
12.5%
r 601449
12.5%
d 601449
12.5%
t 601449
12.5%
l 4
 
< 0.1%
M 2
 
< 0.1%
u 2
 
< 0.1%
Other values (2) 4
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4811608
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1202900
25.0%
o 601451
12.5%
C 601449
12.5%
h 601449
12.5%
r 601449
12.5%
d 601449
12.5%
t 601449
12.5%
l 4
 
< 0.1%
M 2
 
< 0.1%
u 2
 
< 0.1%
Other values (2) 4
 
< 0.1%

class
Text

Distinct2
Distinct (%)< 0.1%
Missing1
Missing (%)< 0.1%
Memory size4.6 MiB

Length

Max length10
Median length8
Mean length8.000006651
Min length8

Characters and Unicode

Total characters4811604
Distinct characters12
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowMammalia
2nd rowMammalia
3rd rowMammalia
4th rowMammalia
5th rowMammalia
Value        Count    Frequency (%)
mammalia     601448   > 99.9%
gastropoda   2        < 0.1%

Most occurring characters

ValueCountFrequency (%)
a 1804348
37.5%
m 1202896
25.0%
M 601448
 
12.5%
l 601448
 
12.5%
i 601448
 
12.5%
o 4
 
< 0.1%
G 2
 
< 0.1%
s 2
 
< 0.1%
t 2
 
< 0.1%
r 2
 
< 0.1%
Other values (2) 4
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4210154
87.5%
Uppercase Letter 601450
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1804348
42.9%
m 1202896
28.6%
l 601448
 
14.3%
i 601448
 
14.3%
o 4
 
< 0.1%
s 2
 
< 0.1%
t 2
 
< 0.1%
r 2
 
< 0.1%
p 2
 
< 0.1%
d 2
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
M 601448
> 99.9%
G 2
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 4811604
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1804348
37.5%
m 1202896
25.0%
M 601448
 
12.5%
l 601448
 
12.5%
i 601448
 
12.5%
o 4
 
< 0.1%
G 2
 
< 0.1%
s 2
 
< 0.1%
t 2
 
< 0.1%
r 2
 
< 0.1%
Other values (2) 4
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4811604
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1804348
37.5%
m 1202896
25.0%
M 601448
 
12.5%
l 601448
 
12.5%
i 601448
 
12.5%
o 4
 
< 0.1%
G 2
 
< 0.1%
s 2
 
< 0.1%
t 2
 
< 0.1%
r 2
 
< 0.1%
Other values (2) 4
 
< 0.1%

order
Text

Distinct29
Distinct (%)< 0.1%
Missing3
Missing (%)< 0.1%
Memory size4.6 MiB

Length

Max length16
Median length8
Mean length8.868951264
Min length6

Characters and Unicode

Total characters5334213
Distinct characters32
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowCarnivora
2nd rowRodentia
3rd rowChiroptera
4th rowRodentia
5th rowCetacea
Value               Count    Frequency (%)
rodentia            297636   49.5%
chiroptera          129084   21.5%
cetacea             47588    7.9%
carnivora           47294    7.9%
soricomorpha        30383    5.1%
lagomorpha          11977    2.0%
artiodactyla        11375    1.9%
primates            10781    1.8%
didelphimorphia     5645     0.9%
diprotodontia       1652     0.3%
Other values (19)   8033     1.3%

Most occurring characters

ValueCountFrequency (%)
a 725656
13.6%
o 618974
11.6%
i 555654
10.4%
e 546103
10.2%
t 514236
9.6%
r 462546
8.7%
n 351518
6.6%
d 320914
6.0%
R 297636
5.6%
C 224385
 
4.2%
Other values (22) 716591
13.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4732765
88.7%
Uppercase Letter 601448
 
11.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 725656
15.3%
o 618974
13.1%
i 555654
11.7%
e 546103
11.5%
t 514236
10.9%
r 462546
9.8%
n 351518
7.4%
d 320914
6.8%
p 186078
 
3.9%
h 184415
 
3.9%
Other values (10) 266671
 
5.6%
Uppercase Letter
ValueCountFrequency (%)
R 297636
49.5%
C 224385
37.3%
S 32307
 
5.4%
P 12509
 
2.1%
A 12049
 
2.0%
L 11977
 
2.0%
D 7786
 
1.3%
M 1503
 
0.2%
E 940
 
0.2%
H 341
 
0.1%
Other values (2) 15
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 5334213
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 725656
13.6%
o 618974
11.6%
i 555654
10.4%
e 546103
10.2%
t 514236
9.6%
r 462546
8.7%
n 351518
6.6%
d 320914
6.0%
R 297636
5.6%
C 224385
 
4.2%
Other values (22) 716591
13.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5334213
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 725656
13.6%
o 618974
11.6%
i 555654
10.4%
e 546103
10.2%
t 514236
9.6%
r 462546
8.7%
n 351518
6.6%
d 320914
6.0%
R 297636
5.6%
C 224385
 
4.2%
Other values (22) 716591
13.4%

family
Text

Distinct158
Distinct (%)< 0.1%
Missing1158
Missing (%)0.2%
Memory size4.6 MiB

Length

Max length19
Median length16
Mean length10.24363436
Min length6

Characters and Unicode

Total characters6149182
Distinct characters42
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique5 ?
Unique (%)< 0.1%

Sample

1st rowProcyonidae
2nd rowCricetidae
3rd rowPhyllostomidae
4th rowCricetidae
5th rowDelphinidae
Value                Count    Frequency (%)
cricetidae           107243   17.9%
muridae              93911    15.6%
phyllostomidae       55530    9.3%
sciuridae            46130    7.7%
soricidae            27470    4.6%
delphinidae          23642    3.9%
vespertilionidae     22260    3.7%
heteromyidae         19997    3.3%
molossidae           13560    2.3%
canidae              12559    2.1%
Other values (148)   177991   29.7%

Most occurring characters

ValueCountFrequency (%)
e 944314
15.4%
i 915778
14.9%
a 664057
10.8%
d 635413
10.3%
r 412407
 
6.7%
o 362011
 
5.9%
t 276360
 
4.5%
l 229210
 
3.7%
c 221440
 
3.6%
u 159982
 
2.6%
Other values (32) 1328210
21.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5548889
90.2%
Uppercase Letter 600293
 
9.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 944314
17.0%
i 915778
16.5%
a 664057
12.0%
d 635413
11.5%
r 412407
7.4%
o 362011
 
6.5%
t 276360
 
5.0%
l 229210
 
4.1%
c 221440
 
4.0%
u 159982
 
2.9%
Other values (12) 727917
13.1%
Uppercase Letter
ValueCountFrequency (%)
C 134036
22.3%
M 133645
22.3%
P 81100
13.5%
S 74875
12.5%
D 35147
 
5.9%
H 30149
 
5.0%
V 23469
 
3.9%
B 14305
 
2.4%
G 12230
 
2.0%
E 11823
 
2.0%
Other values (10) 49514
 
8.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 6149182
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 944314
15.4%
i 915778
14.9%
a 664057
10.8%
d 635413
10.3%
r 412407
 
6.7%
o 362011
 
5.9%
t 276360
 
4.5%
l 229210
 
3.7%
c 221440
 
3.6%
u 159982
 
2.6%
Other values (32) 1328210
21.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6149182
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 944314
15.4%
i 915778
14.9%
a 664057
10.8%
d 635413
10.3%
r 412407
 
6.7%
o 362011
 
5.9%
t 276360
 
4.5%
l 229210
 
3.7%
c 221440
 
3.6%
u 159982
 
2.6%
Other values (32) 1328210
21.6%

genus
Text

Distinct1129
Distinct (%)0.2%
Missing1999
Missing (%)0.3%
Memory size4.6 MiB

Length

Max length17
Median length15
Mean length8.505269813
Min length2

Characters and Unicode

Total characters5098501
Distinct characters50
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique62 ?
Unique (%)< 0.1%

Sample

1st rowPotos
2nd rowMicrotus
3rd rowCarollia
4th rowPeromyscus
5th rowTursiops
Value                 Count    Frequency (%)
peromyscus            38753    6.5%
microtus              19877    3.3%
rattus                16463    2.7%
sorex                 15826    2.6%
artibeus              12467    2.1%
carollia              12281    2.0%
tursiops              11894    2.0%
tamias                11871    2.0%
mastomys              11447    1.9%
mus                   10554    1.8%
Other values (1119)   438019   73.1%

Most occurring characters

ValueCountFrequency (%)
s 603875
 
11.8%
o 509579
 
10.0%
a 348715
 
6.8%
r 348305
 
6.8%
u 337607
 
6.6%
i 326618
 
6.4%
e 313769
 
6.2%
t 248513
 
4.9%
l 221347
 
4.3%
y 216014
 
4.2%
Other values (40) 1624159
31.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4499049
88.2%
Uppercase Letter 599452
 
11.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 603875
13.4%
o 509579
11.3%
a 348715
 
7.8%
r 348305
 
7.7%
u 337607
 
7.5%
i 326618
 
7.3%
e 313769
 
7.0%
t 248513
 
5.5%
l 221347
 
4.9%
y 216014
 
4.8%
Other values (16) 1024707
22.8%
Uppercase Letter
ValueCountFrequency (%)
M 103829
17.3%
P 84771
14.1%
C 58032
9.7%
S 54440
9.1%
T 51640
8.6%
A 33284
 
5.6%
R 31059
 
5.2%
G 28169
 
4.7%
L 22997
 
3.8%
D 21683
 
3.6%
Other values (14) 109548
18.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 5098501
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 603875
 
11.8%
o 509579
 
10.0%
a 348715
 
6.8%
r 348305
 
6.8%
u 337607
 
6.6%
i 326618
 
6.4%
e 313769
 
6.2%
t 248513
 
4.9%
l 221347
 
4.3%
y 216014
 
4.2%
Other values (40) 1624159
31.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5098501
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 603875
 
11.8%
o 509579
 
10.0%
a 348715
 
6.8%
r 348305
 
6.8%
u 337607
 
6.6%
i 326618
 
6.4%
e 313769
 
6.2%
t 248513
 
4.9%
l 221347
 
4.3%
y 216014
 
4.2%
Other values (40) 1624159
31.9%
Distinct1115
Distinct (%)0.2%
Missing2002
Missing (%)0.3%
Memory size4.6 MiB

Length

Max length17
Median length15
Mean length8.504708491
Min length2

Characters and Unicode

Total characters5098139
Distinct characters50
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique64 ?
Unique (%)< 0.1%

Sample

1st rowPotos
2nd rowMicrotus
3rd rowCarollia
4th rowPeromyscus
5th rowTursiops
Value                 Count    Frequency (%)
peromyscus            38753    6.5%
microtus              19877    3.3%
rattus                16463    2.7%
sorex                 15826    2.6%
artibeus              12470    2.1%
carollia              12281    2.0%
tursiops              11894    2.0%
tamias                11871    2.0%
mastomys              11447    1.9%
mus                   10554    1.8%
Other values (1105)   438013   73.1%

Most occurring characters

ValueCountFrequency (%)
s 603400
 
11.8%
o 512945
 
10.1%
r 348628
 
6.8%
a 347937
 
6.8%
u 335989
 
6.6%
i 330068
 
6.5%
e 312892
 
6.1%
t 245783
 
4.8%
l 219644
 
4.3%
m 215952
 
4.2%
Other values (40) 1624901
31.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4498690
88.2%
Uppercase Letter 599449
 
11.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 603400
13.4%
o 512945
11.4%
r 348628
 
7.7%
a 347937
 
7.7%
u 335989
 
7.5%
i 330068
 
7.3%
e 312892
 
7.0%
t 245783
 
5.5%
l 219644
 
4.9%
m 215952
 
4.8%
Other values (16) 1025452
22.8%
Uppercase Letter
ValueCountFrequency (%)
M 102791
17.1%
P 84544
14.1%
C 58079
9.7%
S 54592
9.1%
T 51641
8.6%
A 32571
 
5.4%
R 31084
 
5.2%
G 28179
 
4.7%
L 23170
 
3.9%
N 23069
 
3.8%
Other values (14) 109729
18.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 5098139
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 603400
 
11.8%
o 512945
 
10.1%
r 348628
 
6.8%
a 347937
 
6.8%
u 335989
 
6.6%
i 330068
 
6.5%
e 312892
 
6.1%
t 245783
 
4.8%
l 219644
 
4.3%
m 215952
 
4.2%
Other values (40) 1624901
31.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5098139
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 603400
 
11.8%
o 512945
 
10.1%
r 348628
 
6.8%
a 347937
 
6.8%
u 335989
 
6.6%
i 330068
 
6.5%
e 312892
 
6.1%
t 245783
 
4.8%
l 219644
 
4.3%
m 215952
 
4.2%
Other values (40) 1624901
31.9%

specificEpithet
Text

Missing 

Distinct2771
Distinct (%)0.5%
Missing29657
Missing (%)4.9%
Memory size4.6 MiB

Length

Max length16
Median length14
Mean length8.673424345
Min length2

Characters and Unicode

Total characters4959412
Distinct characters26
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique258 ?
Unique (%)< 0.1%

Sample

1st rowflavus
2nd rowlongicaudus
3rd rowbrevicaudum
4th rowmexicanus
5th rowtruncatus
Value                 Count    Frequency (%)
maniculatus           15647    2.7%
truncatus             11873    2.1%
musculus              8519     1.5%
perspicillata         8339     1.5%
leucopus              7382     1.3%
pennsylvanicus        6799     1.2%
jamaicensis           5581     1.0%
brevicauda            5546     1.0%
rattus                5466     1.0%
cinereus              4761     0.8%
Other values (2761)   491881   86.0%

Most occurring characters

ValueCountFrequency (%)
s 572805
11.5%
i 552816
11.1%
a 502230
10.1%
u 462190
9.3%
e 328983
 
6.6%
r 327562
 
6.6%
n 325551
 
6.6%
l 286376
 
5.8%
t 270290
 
5.5%
c 258935
 
5.2%
Other values (16) 1071674
21.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4959412
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 572805
11.5%
i 552816
11.1%
a 502230
10.1%
u 462190
9.3%
e 328983
 
6.6%
r 327562
 
6.6%
n 325551
 
6.6%
l 286376
 
5.8%
t 270290
 
5.5%
c 258935
 
5.2%
Other values (16) 1071674
21.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 4959412
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 572805
11.5%
i 552816
11.1%
a 502230
10.1%
u 462190
9.3%
e 328983
 
6.6%
r 327562
 
6.6%
n 325551
 
6.6%
l 286376
 
5.8%
t 270290
 
5.5%
c 258935
 
5.2%
Other values (16) 1071674
21.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4959412
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 572805
11.5%
i 552816
11.1%
a 502230
10.1%
u 462190
9.3%
e 328983
 
6.6%
r 327562
 
6.6%
n 325551
 
6.6%
l 286376
 
5.8%
t 270290
 
5.5%
c 258935
 
5.2%
Other values (16) 1071674
21.6%

infraspecificEpithet
Text

Missing 

Distinct2443
Distinct (%)1.1%
Missing386527
Missing (%)64.3%
Memory size4.6 MiB

Length

Max length19
Median length15
Mean length8.768327409
Min length3

Characters and Unicode

Total characters1884524
Distinct characters26
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
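
As a rough illustration of how a figure such as "Distinct categories" can be derived, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` is an assumption made for this example and is not part of the profiling tool. The standard library exposes general categories but not script or block names, so only categories are shown. The input is the five sample rows of this variable.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Sample epithets taken from the sample rows of this variable.
sample = ["longicaudus", "totontepecus", "marinensis", "bairdii", "merriami"]
counts = category_counts(sample)

# "Ll" is the general category the report labels "Lowercase Letter".
print(counts)
```

A single key in the result corresponds to "Distinct categories: 1" for this variable.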

Unique

Unique: 226
Unique (%): 0.1%

Sample

1st row: longicaudus
2nd row: totontepecus
3rd row: marinensis
4th row: bairdii
5th row: merriami

Value                  Count     Frequency (%)
domesticus             4357      2.0%
pennsylvanicus         4127      1.9%
talpoides              3712      1.7%
cinereus               3602      1.7%
trowbridgii            2145      1.0%
merriami               2051      1.0%
lestes                 1946      0.9%
panamensis             1556      0.7%
personatus             1522      0.7%
mexicana               1479      0.7%
Other values (2433)    188427    87.7%

Most occurring characters

Value                Count     Frequency (%)
s                    232184    12.3%
i                    220508    11.7%
a                    170831    9.1%
e                    153788    8.2%
n                    142641    7.6%
u                    141648    7.5%
r                    121518    6.4%
o                    101734    5.4%
l                    100611    5.3%
c                    89089     4.7%
Other values (16)    409972    21.8%

Most occurring categories

Value               Count      Frequency (%)
Lowercase Letter    1884524    100.0%

Most occurring scripts

Value    Count      Frequency (%)
Latin    1884524    100.0%

Most occurring blocks

Value    Count      Frequency (%)
ASCII    1884524    100.0%

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 10
Median length: 7
Mean length: 7.974819229
Min length: 5

Characters and Unicode

Total characters: 4796463
Distinct characters: 18
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: SPECIES
2nd row: SUBSPECIES
3rd row: SPECIES
4th row: SUBSPECIES
5th row: SPECIES

Value         Count     Frequency (%)
species       356873    59.3%
subspecies    214924    35.7%
genus         27655     4.6%
order         1157      0.2%
family        841       0.1%
phylum        1         < 0.1%

Most occurring characters

Value               Count      Frequency (%)
S                   1386173    28.9%
E                   1172406    24.4%
I                   572638     11.9%
P                   571798     11.9%
C                   571797     11.9%
U                   242580     5.1%
B                   214924     4.5%
G                   27655      0.6%
N                   27655      0.6%
R                   2314       < 0.1%
Other values (8)    6523       0.1%

Most occurring categories

Value               Count      Frequency (%)
Uppercase Letter    4796463    100.0%

Most occurring scripts

Value    Count      Frequency (%)
Latin    4796463    100.0%

Most occurring blocks

Value    Count      Frequency (%)
ASCII    4796463    100.0%

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 8
Median length: 8
Mean length: 7.919303484
Min length: 7

Characters and Unicode

Total characters: 4763073
Distinct characters: 15
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ACCEPTED
2nd row: SYNONYM
3rd row: ACCEPTED
4th row: SYNONYM
5th row: ACCEPTED

Value       Count     Frequency (%)
accepted    552476    91.9%
synonym     48535     8.1%
doubtful    440       0.1%

Most occurring characters

Value               Count      Frequency (%)
C                   1104952    23.2%
E                   1104952    23.2%
T                   552916     11.6%
D                   552916     11.6%
A                   552476     11.6%
P                   552476     11.6%
Y                   97070      2.0%
N                   97070      2.0%
O                   48975      1.0%
S                   48535      1.0%
Other values (5)    50735      1.1%

Most occurring categories

Value               Count      Frequency (%)
Uppercase Letter    4763073    100.0%

Most occurring scripts

Value    Count      Frequency (%)
Latin    4763073    100.0%

Most occurring blocks

Value    Count      Frequency (%)
ASCII    4763073    100.0%

datasetKey
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 36
Median length: 36
Mean length: 36
Min length: 36

Characters and Unicode

Total characters: 21652236
Distinct characters: 16
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 821cc27a-e3bb-4bc5-ac34-89ada245069d
2nd row: 821cc27a-e3bb-4bc5-ac34-89ada245069d
3rd row: 821cc27a-e3bb-4bc5-ac34-89ada245069d
4th row: 821cc27a-e3bb-4bc5-ac34-89ada245069d
5th row: 821cc27a-e3bb-4bc5-ac34-89ada245069d

Value                                   Count     Frequency (%)
821cc27a-e3bb-4bc5-ac34-89ada245069d    601451    100.0%

Most occurring characters

Value               Count      Frequency (%)
c                   2405804    11.1%
a                   2405804    11.1%
-                   2405804    11.1%
2                   1804353    8.3%
b                   1804353    8.3%
4                   1804353    8.3%
8                   1202902    5.6%
3                   1202902    5.6%
5                   1202902    5.6%
9                   1202902    5.6%
Other values (6)    4210157    19.4%

Most occurring categories

Value               Count       Frequency (%)
Decimal Number      10826118    50.0%
Lowercase Letter    8420314     38.9%
Dash Punctuation    2405804     11.1%

Most frequent character per category

Decimal Number
Value    Count      Frequency (%)
2        1804353    16.7%
4        1804353    16.7%
8        1202902    11.1%
3        1202902    11.1%
5        1202902    11.1%
9        1202902    11.1%
1        601451     5.6%
7        601451     5.6%
0        601451     5.6%
6        601451     5.6%

Lowercase Letter
Value    Count      Frequency (%)
c        2405804    28.6%
a        2405804    28.6%
b        1804353    21.4%
d        1202902    14.3%
e        601451     7.1%

Dash Punctuation
Value    Count      Frequency (%)
-        2405804    100.0%

Most occurring scripts

Value     Count       Frequency (%)
Common    13231922    61.1%
Latin     8420314     38.9%

Most frequent character per script

Common
Value    Count      Frequency (%)
-        2405804    18.2%
2        1804353    13.6%
4        1804353    13.6%
8        1202902    9.1%
3        1202902    9.1%
5        1202902    9.1%
9        1202902    9.1%
1        601451     4.5%
7        601451     4.5%
0        601451     4.5%

Latin
Value    Count      Frequency (%)
c        2405804    28.6%
a        2405804    28.6%
b        1804353    21.4%
d        1202902    14.3%
e        601451     7.1%

Most occurring blocks

Value    Count       Frequency (%)
ASCII    21652236    100.0%

publishingCountry
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 1202902
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: US
2nd row: US
3rd row: US
4th row: US
5th row: US

Value    Count     Frequency (%)
us       601451    100.0%

Most occurring characters

Value    Count     Frequency (%)
U        601451    50.0%
S        601451    50.0%

Most occurring categories

Value               Count      Frequency (%)
Uppercase Letter    1202902    100.0%

Most occurring scripts

Value    Count      Frequency (%)
Latin    1202902    100.0%

Most occurring blocks

Value    Count      Frequency (%)
ASCII    1202902    100.0%

Distinct: 185984
Distinct (%): 30.9%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 24
Median length: 24
Mean length: 23.99573698
Min length: 20

Characters and Unicode

Total characters: 14432260
Distinct characters: 15
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 38937
Unique (%): 6.5%

Sample

1st row: 2024-12-02T13:58:01.255Z
2nd row: 2024-12-02T13:59:38.442Z
3rd row: 2024-12-02T13:56:07.605Z
4th row: 2024-12-02T13:58:24.850Z
5th row: 2024-12-02T13:56:12.476Z

Value                       Count     Frequency (%)
2024-12-02t13:57:14.377z    17        < 0.1%
2024-12-02t13:57:24.313z    17        < 0.1%
2024-12-02t13:57:59.063z    17        < 0.1%
2024-12-02t13:57:52.813z    17        < 0.1%
2024-12-02t13:57:15.231z    17        < 0.1%
2024-12-02t13:57:50.062z    16        < 0.1%
2024-12-02t13:57:52.024z    16        < 0.1%
2024-12-02t13:57:25.776z    16        < 0.1%
2024-12-02t13:56:59.760z    15        < 0.1%
2024-12-02t13:57:24.391z    15        < 0.1%
Other values (185974)       601288    > 99.9%
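
The sample rows look like ISO 8601 timestamps in UTC. The minimum length of 20 against a maximum of 24 suggests that some values omit the millisecond part. That is an inference from the length statistics, not something the report states. The sketch below parses both forms with the standard `datetime` module. The helper name `parse_utc` and the 20-character example value are assumptions made for illustration.

```python
from datetime import datetime, timezone


def parse_utc(text):
    """Parse 'YYYY-MM-DDTHH:MM:SS[.fff]Z' into a timezone-aware UTC datetime."""
    for fmt in ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ"):
        try:
            return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError(f"unrecognised timestamp: {text!r}")


with_ms = parse_utc("2024-12-02T13:58:01.255Z")  # first sample row
without_ms = parse_utc("2024-12-02T13:58:01Z")   # hypothetical 20-character form
print(with_ms.isoformat(), without_ms.isoformat())
```

Trying the more specific format first keeps the common 24-character case on the fast path.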

Most occurring characters

Value               Count      Frequency (%)
2                   2746380    19.0%
0                   1525337    10.6%
1                   1517832    10.5%
-                   1202902    8.3%
:                   1202902    8.3%
4                   967155     6.7%
5                   955236     6.6%
3                   952306     6.6%
T                   601451     4.2%
Z                   601451     4.2%
Other values (5)    2159308    15.0%

Most occurring categories

Value                Count       Frequency (%)
Decimal Number       10222744    70.8%
Other Punctuation    1803712     12.5%
Dash Punctuation     1202902     8.3%
Uppercase Letter     1202902     8.3%

Most frequent character per category

Decimal Number
Value    Count      Frequency (%)
2        2746380    26.9%
0        1525337    14.9%
1        1517832    14.8%
4        967155     9.5%
5        955236     9.3%
3        952306     9.3%
7        460995     4.5%
9        384640     3.8%
6        362872     3.5%
8        349991     3.4%

Other Punctuation
Value    Count      Frequency (%)
:        1202902    66.7%
.        600810     33.3%

Uppercase Letter
Value    Count     Frequency (%)
T        601451    50.0%
Z        601451    50.0%

Dash Punctuation
Value    Count      Frequency (%)
-        1202902    100.0%

Most occurring scripts

Value     Count       Frequency (%)
Common    13229358    91.7%
Latin     1202902     8.3%

Most frequent character per script

Common
Value               Count      Frequency (%)
2                   2746380    20.8%
0                   1525337    11.5%
1                   1517832    11.5%
-                   1202902    9.1%
:                   1202902    9.1%
4                   967155     7.3%
5                   955236     7.2%
3                   952306     7.2%
.                   600810     4.5%
7                   460995     3.5%
Other values (3)    1097503    8.3%

Latin
Value    Count     Frequency (%)
T        601451    50.0%
Z        601451    50.0%

Most occurring blocks

Value    Count       Frequency (%)
ASCII    14432260    100.0%

elevation
Text

Missing 

Distinct: 1569
Distinct (%): 1.5%
Missing: 496901
Missing (%): 82.6%
Memory size: 4.6 MiB

Length

Max length: 6
Median length: 5
Mean length: 5.310425634
Min length: 3

Characters and Unicode

Total characters: 555205
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 448
Unique (%): 0.4%

Sample

1st row: 1032.0
2nd row: 1006.0
3rd row: 545.0
4th row: 2134.0
5th row: 130.0

Value                  Count    Frequency (%)
155.0                  2555     2.4%
150.0                  2080     2.0%
975.0                  1931     1.8%
1829.0                 1920     1.8%
1219.0                 1756     1.7%
1524.0                 1715     1.6%
2438.0                 1448     1.4%
2134.0                 1349     1.3%
914.0                  1245     1.2%
610.0                  1175     1.1%
Other values (1556)    87376    83.6%
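
Although elevation is profiled as Text, its values read as decimal numbers. A minimal sketch of a conversion step follows, assuming that empty cells stand for the missing values reported above. The helper name `to_float` and the empty-string convention are illustrative assumptions, not something the report specifies.

```python
import math


def to_float(text):
    """Convert a text cell to float; empty or missing cells become NaN."""
    if text is None or text.strip() == "":
        return math.nan
    return float(text)


# The five sample rows, plus one empty cell standing in for a missing value.
raw = ["1032.0", "1006.0", "545.0", "2134.0", "130.0", ""]
elevations = [to_float(v) for v in raw]

# Summary statistics should skip NaN rather than propagate it.
present = [e for e in elevations if not math.isnan(e)]
mean_elevation = sum(present) / len(present)
print(len(present), mean_elevation)
```

With 82.6% of cells missing, any aggregate over this column describes only the minority of records that carry an elevation.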

Most occurring characters

Value               Count     Frequency (%)
0                   157702    28.4%
.                   104550    18.8%
1                   64998     11.7%
2                   42656     7.7%
5                   41946     7.6%
3                   29052     5.2%
4                   25982     4.7%
7                   23929     4.3%
9                   22522     4.1%
6                   21007     3.8%
Other values (2)    20861     3.8%

Most occurring categories

Value                Count     Frequency (%)
Decimal Number       450648    81.2%
Other Punctuation    104550    18.8%
Dash Punctuation     7         < 0.1%

Most frequent character per category

Decimal Number
Value    Count     Frequency (%)
0        157702    35.0%
1        64998     14.4%
2        42656     9.5%
5        41946     9.3%
3        29052     6.4%
4        25982     5.8%
7        23929     5.3%
9        22522     5.0%
6        21007     4.7%
8        20854     4.6%

Other Punctuation
Value    Count     Frequency (%)
.        104550    100.0%

Dash Punctuation
Value    Count    Frequency (%)
-        7        100.0%

Most occurring scripts

Value     Count     Frequency (%)
Common    555205    100.0%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    555205    100.0%

elevationAccuracy
Text

Missing 

Distinct: 72
Distinct (%): 1.9%
Missing: 597572
Missing (%): 99.4%
Memory size: 4.6 MiB

Length

Max length: 6
Median length: 4
Mean length: 4.337200309
Min length: 3

Characters and Unicode

Total characters: 16824
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 14
Unique (%): 0.4%

Sample

1st row: 15.5
2nd row: 46.0
3rd row: 30.5
4th row: 250.0
5th row: 150.0

Value                Count    Frequency (%)
38.0                 907      23.4%
150.0                345      8.9%
250.0                244      6.3%
304.5                236      6.1%
120.0                156      4.0%
46.0                 152      3.9%
76.5                 149      3.8%
100.0                145      3.7%
15.0                 122      3.1%
37.5                 108      2.8%
Other values (62)    1315     33.9%

Most occurring characters

Value    Count    Frequency (%)
0        4442     26.4%
.        3879     23.1%
5        2371     14.1%
3        1535     9.1%
1        1270     7.5%
8        1062     6.3%
2        857      5.1%
4        589      3.5%
7        404      2.4%
6        395      2.3%

Most occurring categories

Value                Count    Frequency (%)
Decimal Number       12945    76.9%
Other Punctuation    3879     23.1%

Most frequent character per category

Decimal Number
Value    Count    Frequency (%)
0        4442     34.3%
5        2371     18.3%
3        1535     11.9%
1        1270     9.8%
8        1062     8.2%
2        857      6.6%
4        589      4.6%
7        404      3.1%
6        395      3.1%
9        20       0.2%

Other Punctuation
Value    Count    Frequency (%)
.        3879     100.0%

Most occurring scripts

Value     Count    Frequency (%)
Common    16824    100.0%

Most occurring blocks

Value    Count    Frequency (%)
ASCII    16824    100.0%

depth
Text

Missing 

Distinct: 2
Distinct (%): 66.7%
Missing: 601448
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 6
Median length: 6
Mean length: 5.666666667
Min length: 5

Characters and Unicode

Total characters: 17
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1
Unique (%): 33.3%
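
The figures for depth are consistent with Distinct (%) and Unique (%) being taken over the non-missing values and Missing (%) over all observations. The sketch below reproduces them from the three non-missing values listed for this variable and the dataset's 601451 observations. The variable names are illustrative, and the rule itself is inferred from the numbers, not quoted from the tool's documentation.

```python
from collections import Counter

total_rows = 601451                      # number of observations in the dataset
values = ["853.0", "1600.0", "1600.0"]   # the three non-missing depth values

counts = Counter(values)
n_present = len(values)
n_missing = total_rows - n_present

distinct = len(counts)                                  # different values
unique = sum(1 for c in counts.values() if c == 1)      # values occurring once

distinct_pct = 100 * distinct / n_present   # share of non-missing values
unique_pct = 100 * unique / n_present       # share of non-missing values
missing_pct = 100 * n_missing / total_rows  # share of all rows

total_chars = sum(len(v) for v in values)
mean_length = total_chars / n_present

print(distinct, round(distinct_pct, 1), unique, round(unique_pct, 1))
print(n_missing, total_chars, round(mean_length, 9))
```

With only three values present, a single extra record would move every percentage noticeably, so these ratios say little about the dataset as a whole.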

Sample

1st row: 853.0
2nd row: 1600.0
3rd row: 1600.0

Value     Count    Frequency (%)
1600.0    2        66.7%
853.0     1        33.3%

Most occurring characters

Value    Count    Frequency (%)
0        7        41.2%
.        3        17.6%
1        2        11.8%
6        2        11.8%
8        1        5.9%
5        1        5.9%
3        1        5.9%

Most occurring categories

Value                Count    Frequency (%)
Decimal Number       14       82.4%
Other Punctuation    3        17.6%

Most frequent character per category

Decimal Number
Value    Count    Frequency (%)
0        7        50.0%
1        2        14.3%
6        2        14.3%
8        1        7.1%
5        1        7.1%
3        1        7.1%

Other Punctuation
Value    Count    Frequency (%)
.        3        100.0%

Most occurring scripts

Value     Count    Frequency (%)
Common    17       100.0%

Most occurring blocks

Value    Count    Frequency (%)
ASCII    17       100.0%

Distinct: 34
Distinct (%): 12.5%
Missing: 601180
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 18
Median length: 17
Mean length: 16.32472325
Min length: 3

Characters and Unicode

Total characters: 4424
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 20
Unique (%): 7.4%

Sample

1st row: 4411.160071289899
2nd row: 4411.160071289899
3rd row: 0.0
4th row: 4411.160071289899
5th row: 1895.2753464364682

Value                 Count    Frequency (%)
4411.160071289899     100      36.9%
918.1358064728217     59       21.8%
818.1211019658687     23       8.5%
0.0                   16       5.9%
1698.8813565505823    14       5.2%
1895.2753464364682    9        3.3%
2501.879815645856     7        2.6%
862.8264353705852     5        1.8%
1136.4802457515602    5        1.8%
3374.3891962124544    4        1.5%
Other values (24)     29       10.7%

Most occurring characters

Value    Count    Frequency (%)
1        818      18.5%
8        634      14.3%
9        458      10.4%
4        388      8.8%
0        384      8.7%
2        361      8.2%
6        349      7.9%
7        316      7.1%
.        271      6.1%
5        263      5.9%

Most occurring categories

Value                Count    Frequency (%)
Decimal Number       4153     93.9%
Other Punctuation    271      6.1%

Most frequent character per category

Decimal Number
Value    Count    Frequency (%)
1        818      19.7%
8        634      15.3%
9        458      11.0%
4        388      9.3%
0        384      9.2%
2        361      8.7%
6        349      8.4%
7        316      7.6%
5        263      6.3%
3        182      4.4%

Other Punctuation
Value    Count    Frequency (%)
.        271      100.0%

Most occurring scripts

Value     Count    Frequency (%)
Common    4424     100.0%

Most occurring blocks

Value    Count    Frequency (%)
ASCII    4424     100.0%

issue
Text

Distinct: 117
Distinct (%): < 0.1%
Missing: 13
Missing (%): < 0.1%
Memory size: 4.6 MiB

Length

Max length: 187
Median length: 48
Mean length: 62.38084724
Min length: 48

Characters and Unicode

Total characters: 37518212
Distinct characters: 28
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 21
Unique (%): < 0.1%

Sample

1st row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT
2nd row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT
3rd row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84;TAXON_MATCH_FUZZY
4th row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT
5th row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84;CONTINENT_DERIVED_FROM_COORDINATES;CONTINENT_INVALID

Value  Count  Frequency (%)
occurrence_status_inferred_from_individual_count  356968  59.4%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84  110523  18.4%
occurrence_status_inferred_from_individual_count;taxon_match_higherrank  68486  11.4%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84;continent_derived_from_coordinates;continent_invalid  19137  3.2%
occurrence_status_inferred_from_individual_count;continent_derived_from_country;continent_invalid  9426  1.6%
occurrence_status_inferred_from_individual_count;taxon_match_fuzzy  7808  1.3%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84;geodetic_datum_invalid  6304  1.0%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84;taxon_match_higherrank  3860  0.6%
occurrence_status_inferred_from_individual_count;country_derived_from_coordinates;geodetic_datum_assumed_wgs84;continent_invalid  3671  0.6%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84;continent_invalid  3605  0.6%
Other values (107)  11650  1.9%
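
Each issue value is a semicolon-delimited list of flags, so the frequency table above counts flag combinations, not individual flags. A minimal sketch of counting flags one by one follows, using the five sample rows as input. The helper name `flag_counts` is an assumption made for this example.

```python
from collections import Counter


def flag_counts(issue_values):
    """Split semicolon-delimited issue strings and count each flag separately."""
    counts = Counter()
    for value in issue_values:
        if not value:  # skip missing cells
            continue
        counts.update(flag for flag in value.split(";") if flag)
    return counts


# The five sample rows of the issue variable.
rows = [
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT",
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT",
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84;TAXON_MATCH_FUZZY",
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT",
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84;CONTINENT_DERIVED_FROM_COORDINATES;CONTINENT_INVALID",
]
counts = flag_counts(rows)
for flag, n in counts.most_common():
    print(flag, n)
```

Counting per flag makes it easier to see, for instance, how often a geodetic datum was assumed regardless of which other flags accompany it.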

Most occurring characters

Value                Count      Frequency (%)
_                    3814538    10.2%
R                    3273760    8.7%
E                    3131386    8.3%
I                    2892193    7.7%
N                    2889124    7.7%
C                    2776336    7.4%
U                    2757004    7.3%
T                    2507326    6.7%
D                    2429693    6.5%
O                    2238112    6.0%
Other values (18)    8808740    23.5%

Most occurring categories

Value                    Count       Frequency (%)
Uppercase Letter         33066744    88.1%
Connector Punctuation    3814538     10.2%
Other Punctuation        329862      0.9%
Decimal Number           307068      0.8%

Most frequent character per category

Uppercase Letter
Value                Count      Frequency (%)
R                    3273760    9.9%
E                    3131386    9.5%
I                    2892193    8.7%
N                    2889124    8.7%
C                    2776336    8.4%
U                    2757004    8.3%
T                    2507326    7.6%
D                    2429693    7.3%
O                    2238112    6.8%
A                    1842556    5.6%
Other values (14)    6329254    19.1%

Decimal Number
Value    Count     Frequency (%)
8        153534    50.0%
4        153534    50.0%

Connector Punctuation
Value    Count      Frequency (%)
_        3814538    100.0%

Other Punctuation
Value    Count     Frequency (%)
;        329862    100.0%

Most occurring scripts

Value     Count       Frequency (%)
Latin     33066744    88.1%
Common    4451468     11.9%

Most frequent character per script

Latin
Value                Count      Frequency (%)
R                    3273760    9.9%
E                    3131386    9.5%
I                    2892193    8.7%
N                    2889124    8.7%
C                    2776336    8.4%
U                    2757004    8.3%
T                    2507326    7.6%
D                    2429693    7.3%
O                    2238112    6.8%
A                    1842556    5.6%
Other values (14)    6329254    19.1%

Common
Value    Count      Frequency (%)
_        3814538    85.7%
;        329862     7.4%
8        153534     3.4%
4        153534     3.4%

Most occurring blocks

Value    Count       Frequency (%)
ASCII    37518212    100.0%

mediaType
Text

Missing 

Distinct: 55
Distinct (%): < 0.1%
Missing: 45831
Missing (%): 7.6%
Memory size: 4.6 MiB

Length

Max length: 1385
Median length: 10
Mean length: 11.66078975
Min length: 10

Characters and Unicode

Total characters: 6478968
Distinct characters: 10
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 12
Unique (%): < 0.1%

Sample

1st row: StillImage
2nd row: StillImage
3rd row: StillImage
4th row: StillImage
5th row: StillImage

Value  Count  Frequency (%)
stillimage  509092  91.6%
stillimage;stillimage  38794  7.0%
stillimage;stillimage;stillimage  2854  0.5%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage  1339  0.2%
stillimage;stillimage;stillimage;stillimage  1231  0.2%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage  614  0.1%
stillimage;stillimage;stillimage;stillimage;stillimage  321  0.1%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage  256  < 0.1%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage  250  < 0.1%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage  170  < 0.1%
Other values (45)  699  0.1%
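
Each mediaType value repeats "StillImage" once per attached media item, separated by semicolons, so the number of items on a record can be recovered by splitting the string. The sketch below assumes that reading. The helper name `media_item_count` is hypothetical, and the 126-entry string is constructed for illustration to show that it matches the reported maximum length of 1385 (126 entries of 10 characters plus 125 separators).

```python
def media_item_count(media_type):
    """Return the number of semicolon-separated entries; 0 for a missing cell."""
    if not media_type:
        return 0
    return len(media_type.split(";"))


one = media_item_count("StillImage")
three = media_item_count("StillImage;StillImage;StillImage")

# Constructed example: 126 entries give 126 * 10 + 125 = 1385 characters.
longest = ";".join(["StillImage"] * 126)
print(one, three, len(longest), media_item_count(longest))
```

A derived integer column of this kind is usually easier to summarise than the 55 distinct repeated strings.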

Most occurring characters

Value    Count      Frequency (%)
l        1279016    19.7%
S        639508     9.9%
t        639508     9.9%
i        639508     9.9%
I        639508     9.9%
m        639508     9.9%
a        639508     9.9%
g        639508     9.9%
e        639508     9.9%
;        83888      1.3%

Most occurring categories

Value                Count      Frequency (%)
Lowercase Letter     5116064    79.0%
Uppercase Letter     1279016    19.7%
Other Punctuation    83888      1.3%

Most frequent character per category

Lowercase Letter
Value    Count      Frequency (%)
l        1279016    25.0%
t        639508     12.5%
i        639508     12.5%
m        639508     12.5%
a        639508     12.5%
g        639508     12.5%
e        639508     12.5%

Uppercase Letter
Value    Count     Frequency (%)
S        639508    50.0%
I        639508    50.0%

Other Punctuation
Value    Count    Frequency (%)
;        83888    100.0%

Most occurring scripts

Value     Count      Frequency (%)
Latin     6395080    98.7%
Common    83888      1.3%

Most frequent character per script

Latin
Value    Count      Frequency (%)
l        1279016    20.0%
S        639508     10.0%
t        639508     10.0%
i        639508     10.0%
I        639508     10.0%
m        639508     10.0%
a        639508     10.0%
g        639508     10.0%
e        639508     10.0%

Common
Value    Count    Frequency (%)
;        83888    100.0%

Most occurring blocks

Value    Count      Frequency (%)
ASCII    6478968    100.0%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 5
Median length: 5
Mean length: 4.744727334
Min length: 4

Characters and Unicode

Total characters: 2853721
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: false
2nd row: false
3rd row: true
4th row: false
5th row: true

Value    Count     Frequency (%)
false    447917    74.5%
true     153534    25.5%

Most occurring characters

Value    Count     Frequency (%)
e        601451    21.1%
f        447917    15.7%
a        447917    15.7%
l        447917    15.7%
s        447917    15.7%
t        153534    5.4%
r        153534    5.4%
u        153534    5.4%

Most occurring categories

Value               Count      Frequency (%)
Lowercase Letter    2853721    100.0%

Most occurring scripts

Value    Count      Frequency (%)
Latin    2853721    100.0%

Most occurring blocks

Value    Count      Frequency (%)
ASCII    2853721    100.0%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 5
Median length: 5
Mean length: 4.996536709
Min length: 4

Characters and Unicode

Total characters: 3005172
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: false
2nd row: false
3rd row: false
4th row: false
5th row: false

Value    Count     Frequency (%)
false    599368    99.7%
true     2083      0.3%

Most occurring characters

Value    Count     Frequency (%)
e        601451    20.0%
f        599368    19.9%
a        599368    19.9%
l        599368    19.9%
s        599368    19.9%
t        2083      0.1%
r        2083      0.1%
u        2083      0.1%

Most occurring categories

Value               Count      Frequency (%)
Lowercase Letter    3005172    100.0%

Most occurring scripts

Value    Count      Frequency (%)
Latin    3005172    100.0%

Most occurring blocks

Value    Count      Frequency (%)
ASCII    3005172    100.0%

Distinct: 7326
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 8
Median length: 7
Mean length: 6.991821445
Min length: 2

Characters and Unicode

Total characters: 4205238
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 849
Unique (%): 0.1%

Sample

1st row: 2433573
2nd row: 6163544
3rd row: 2433177
4th row: 9119004
5th row: 2440447

Value                  Count     Frequency (%)
2437967                13357     2.2%
2440447                11847     2.0%
2438904                8874      1.5%
2433176                8329      1.4%
2438019                7116      1.2%
2433272                5470      0.9%
2439270                5412      0.9%
2437782                5206      0.9%
4264939                4687      0.8%
5706760                4437      0.7%
Other values (7316)    526716    87.6%

Most occurring characters

Value    Count     Frequency (%)
2        686795    16.3%
4        677503    16.1%
3        521933    12.4%
6        471038    11.2%
7        393586    9.4%
1        320627    7.6%
8        310437    7.4%
9        301609    7.2%
5        266002    6.3%
0        255708    6.1%

Most occurring categories

Value             Count      Frequency (%)
Decimal Number    4205238    100.0%

Most occurring scripts

Value     Count      Frequency (%)
Common    4205238    100.0%

Most occurring blocks

Value    Count      Frequency (%)
ASCII    4205238    100.0%

Distinct6815
Distinct (%)1.1%
Missing0
Missing (%)0.0%
Memory size4.6 MiB

Length

Max length8
Median length7
Mean length6.999645856
Min length2

Characters and Unicode

Total characters4209944
Distinct characters10
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique793 ?
Unique (%)0.1%

Sample

1st row2433573
2nd row2438621
3rd row2433177
4th row2438034
5th row2440447
ValueCountFrequency (%)
2437967 14724
 
2.4%
2440447 11867
 
2.0%
2438904 8874
 
1.5%
2433176 8329
 
1.4%
2438019 7347
 
1.2%
2438655 6840
 
1.1%
2433272 5470
 
0.9%
2439270 5412
 
0.9%
2437782 5206
 
0.9%
4264939 4687
 
0.8%
Other values (6805) 522695
86.9%

Most occurring characters

ValueCountFrequency (%)
2 695586
16.5%
4 682390
16.2%
3 540567
12.8%
6 455861
10.8%
7 389135
9.2%
1 323452
7.7%
8 317316
7.5%
9 293353
7.0%
5 267471
 
6.4%
0 244813
 
5.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4209944
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 695586
16.5%
4 682390
16.2%
3 540567
12.8%
6 455861
10.8%
7 389135
9.2%
1 323452
7.7%
8 317316
7.5%
9 293353
7.0%
5 267471
 
6.4%
0 244813
 
5.8%

Most occurring scripts

ValueCountFrequency (%)
Common 4209944
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 695586
16.5%
4 682390
16.2%
3 540567
12.8%
6 455861
10.8%
7 389135
9.2%
1 323452
7.7%
8 317316
7.5%
9 293353
7.0%
5 267471
 
6.4%
0 244813
 
5.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4209944
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 695586
16.5%
4 682390
16.2%
3 540567
12.8%
6 455861
10.8%
7 389135
9.2%
1 323452
7.7%
8 317316
7.5%
9 293353
7.0%
5 267471
 
6.4%
0 244813
 
5.8%

kingdomKey
Text

Constant 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.6 MiB

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters601451
Distinct characters1
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row1
3rd row1
4th row1
5th row1
ValueCountFrequency (%)
1 601451
100.0%

Most occurring characters

ValueCountFrequency (%)
1 601451
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 601451
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 601451
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 601451
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 601451
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 601451
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 601451
100.0%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.6 MiB

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters1202902
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row44
2nd row44
3rd row44
4th row44
5th row44
ValueCountFrequency (%)
44 601449
> 99.9%
52 2
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
4 1202898
> 99.9%
5 2
 
< 0.1%
2 2
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1202902
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 1202898
> 99.9%
5 2
 
< 0.1%
2 2
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 1202902
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 1202898
> 99.9%
5 2
 
< 0.1%
2 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1202902
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 1202898
> 99.9%
5 2
 
< 0.1%
2 2
 
< 0.1%
Distinct2
Distinct (%)< 0.1%
Missing1
Missing (%)< 0.1%
Memory size4.6 MiB

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1804350
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row359
2nd row359
3rd row359
4th row359
5th row359
ValueCountFrequency (%)
359 601448
> 99.9%
225 2
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 601450
33.3%
3 601448
33.3%
9 601448
33.3%
2 4
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1804350
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 601450
33.3%
3 601448
33.3%
9 601448
33.3%
2 4
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 1804350
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 601450
33.3%
3 601448
33.3%
9 601448
33.3%
2 4
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1804350
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 601450
33.3%
3 601448
33.3%
9 601448
33.3%
2 4
 
< 0.1%
Distinct29
Distinct (%)< 0.1%
Missing3
Missing (%)< 0.1%
Memory size4.6 MiB

Length

Max length4
Median length4
Mean length3.501132267
Min length3

Characters and Unicode

Total characters2105749
Distinct characters10
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row732
2nd row1459
3rd row734
4th row1459
5th row733
ValueCountFrequency (%)
1459 297636
49.5%
734 129084
21.5%
733 47588
 
7.9%
732 47294
 
7.9%
803 30383
 
5.1%
785 11977
 
2.0%
731 11375
 
1.9%
798 10781
 
1.8%
783 5645
 
0.9%
1452 1652
 
0.3%
Other values (19) 8033
 
1.3%

Most occurring characters

ValueCountFrequency (%)
4 431834
20.5%
3 321250
15.3%
1 314421
14.9%
5 313551
14.9%
9 311268
14.8%
7 266248
12.6%
8 63351
 
3.0%
2 51029
 
2.4%
0 32324
 
1.5%
6 473
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2105749
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 431834
20.5%
3 321250
15.3%
1 314421
14.9%
5 313551
14.9%
9 311268
14.8%
7 266248
12.6%
8 63351
 
3.0%
2 51029
 
2.4%
0 32324
 
1.5%
6 473
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 2105749
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 431834
20.5%
3 321250
15.3%
1 314421
14.9%
5 313551
14.9%
9 311268
14.8%
7 266248
12.6%
8 63351
 
3.0%
2 51029
 
2.4%
0 32324
 
1.5%
6 473
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2105749
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 431834
20.5%
3 321250
15.3%
1 314421
14.9%
5 313551
14.9%
9 311268
14.8%
7 266248
12.6%
8 63351
 
3.0%
2 51029
 
2.4%
0 32324
 
1.5%
6 473
 
< 0.1%
Distinct158
Distinct (%)< 0.1%
Missing1158
Missing (%)0.2%
Memory size4.6 MiB

Length

Max length8
Median length4
Mean length4.622670929
Min length4

Characters and Unicode

Total characters2774957
Distinct characters10
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique5 ?
Unique (%)< 0.1%

Sample

1st row5311
2nd row3240723
3rd row9366
4th row3240723
5th row5314
ValueCountFrequency (%)
3240723 107243
17.9%
5510 93911
15.6%
9366 55530
 
9.3%
9456 46130
 
7.7%
5534 27470
 
4.6%
5314 23642
 
3.9%
9368 22260
 
3.7%
5504 19997
 
3.3%
5719 13560
 
2.3%
9701 12559
 
2.1%
Other values (148) 177991
29.7%

Most occurring characters

ValueCountFrequency (%)
5 475202
17.1%
3 474928
17.1%
4 299816
10.8%
9 285815
10.3%
0 267438
9.6%
6 258430
9.3%
2 257026
9.3%
7 205986
7.4%
1 200321
7.2%
8 49995
 
1.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2774957
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 475202
17.1%
3 474928
17.1%
4 299816
10.8%
9 285815
10.3%
0 267438
9.6%
6 258430
9.3%
2 257026
9.3%
7 205986
7.4%
1 200321
7.2%
8 49995
 
1.8%

Most occurring scripts

ValueCountFrequency (%)
Common 2774957
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 475202
17.1%
3 474928
17.1%
4 299816
10.8%
9 285815
10.3%
0 267438
9.6%
6 258430
9.3%
2 257026
9.3%
7 205986
7.4%
1 200321
7.2%
8 49995
 
1.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2774957
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 475202
17.1%
3 474928
17.1%
4 299816
10.8%
9 285815
10.3%
0 267438
9.6%
6 258430
9.3%
2 257026
9.3%
7 205986
7.4%
1 200321
7.2%
8 49995
 
1.8%
Distinct1129
Distinct (%)0.2%
Missing1999
Missing (%)0.3%
Memory size4.6 MiB

Length

Max length8
Median length7
Mean length7.000950868
Min length7

Characters and Unicode

Total characters4196734
Distinct characters10
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique62 ?
Unique (%)< 0.1%

Sample

1st row2433572
2nd row2438591
3rd row2433174
4th row2437961
5th row2440446
ValueCountFrequency (%)
2437961 38753
 
6.5%
2438591 19877
 
3.3%
2439223 16463
 
2.7%
2435935 15826
 
2.6%
2433258 12467
 
2.1%
2433174 12281
 
2.0%
2440446 11894
 
2.0%
2437422 11871
 
2.0%
2438904 11447
 
1.9%
9800657 10554
 
1.8%
Other values (1119) 438019
73.1%

Most occurring characters

ValueCountFrequency (%)
2 839496
20.0%
4 830509
19.8%
3 764513
18.2%
9 329906
 
7.9%
8 279872
 
6.7%
7 255740
 
6.1%
5 254775
 
6.1%
6 229246
 
5.5%
1 221313
 
5.3%
0 191364
 
4.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4196734
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 839496
20.0%
4 830509
19.8%
3 764513
18.2%
9 329906
 
7.9%
8 279872
 
6.7%
7 255740
 
6.1%
5 254775
 
6.1%
6 229246
 
5.5%
1 221313
 
5.3%
0 191364
 
4.6%

Most occurring scripts

ValueCountFrequency (%)
Common 4196734
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 839496
20.0%
4 830509
19.8%
3 764513
18.2%
9 329906
 
7.9%
8 279872
 
6.7%
7 255740
 
6.1%
5 254775
 
6.1%
6 229246
 
5.5%
1 221313
 
5.3%
0 191364
 
4.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4196734
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 839496
20.0%
4 830509
19.8%
3 764513
18.2%
9 329906
 
7.9%
8 279872
 
6.7%
7 255740
 
6.1%
5 254775
 
6.1%
6 229246
 
5.5%
1 221313
 
5.3%
0 191364
 
4.6%

speciesKey
Text

Missing 

Distinct3897
Distinct (%)0.7%
Missing29663
Missing (%)4.9%
Memory size4.6 MiB

Length

Max length8
Median length7
Mean length7.006916899
Min length7

Characters and Unicode

Total characters4006471
Distinct characters10
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique406 ?
Unique (%)0.1%

Sample

1st row2433573
2nd row2438621
3rd row2433177
4th row2438034
5th row2440447
ValueCountFrequency (%)
2437967 15647
 
2.7%
2440447 11869
 
2.1%
2433176 8329
 
1.5%
2438019 7347
 
1.3%
2438655 6840
 
1.2%
7429082 6399
 
1.1%
2433272 5482
 
1.0%
2439270 5412
 
0.9%
5219153 4558
 
0.8%
5706760 4437
 
0.8%
Other values (3887) 495468
86.7%

Most occurring characters

ValueCountFrequency (%)
2 731613
18.3%
4 679263
17.0%
3 607462
15.2%
7 337688
8.4%
9 302340
7.5%
8 300825
7.5%
6 290793
 
7.3%
5 286808
 
7.2%
1 253470
 
6.3%
0 216209
 
5.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4006471
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 731613
18.3%
4 679263
17.0%
3 607462
15.2%
7 337688
8.4%
9 302340
7.5%
8 300825
7.5%
6 290793
 
7.3%
5 286808
 
7.2%
1 253470
 
6.3%
0 216209
 
5.4%

Most occurring scripts

ValueCountFrequency (%)
Common 4006471
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 731613
18.3%
4 679263
17.0%
3 607462
15.2%
7 337688
8.4%
9 302340
7.5%
8 300825
7.5%
6 290793
 
7.3%
5 286808
 
7.2%
1 253470
 
6.3%
0 216209
 
5.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4006471
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 731613
18.3%
4 679263
17.0%
3 607462
15.2%
7 337688
8.4%
9 302340
7.5%
8 300825
7.5%
6 290793
 
7.3%
5 286808
 
7.2%
1 253470
 
6.3%
0 216209
 
5.4%

species
Text

Missing 

Distinct3897
Distinct (%)0.7%
Missing29663
Missing (%)4.9%
Memory size4.6 MiB

Length

Max length31
Median length26
Mean length18.14441541
Min length8

Characters and Unicode

Total characters10374759
Distinct characters51
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique406 ?
Unique (%)0.1%

Sample

1st rowPotos flavus
2nd rowMicrotus longicaudus
3rd rowCarollia brevicaudum
4th rowPeromyscus mexicanus
5th rowTursiops truncatus
ValueCountFrequency (%)
peromyscus 38710
 
3.4%
rattus 21793
 
1.9%
microtus 19863
 
1.7%
sorex 15805
 
1.4%
maniculatus 15647
 
1.4%
artibeus 12162
 
1.1%
tursiops 11892
 
1.0%
truncatus 11873
 
1.0%
tamias 11870
 
1.0%
carollia 11315
 
1.0%
Other values (3847) 972646
85.1%

Most occurring characters

ValueCountFrequency (%)
s 1138591
 
11.0%
i 850505
 
8.2%
a 832841
 
8.0%
u 789136
 
7.6%
o 733517
 
7.1%
r 656869
 
6.3%
e 630821
 
6.1%
(space) 571788
 
5.5%
t 508552
 
4.9%
l 479658
 
4.6%
Other values (41) 3182481
30.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9231183
89.0%
Space Separator 571788
 
5.5%
Uppercase Letter 571788
 
5.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 1138591
12.3%
i 850505
 
9.2%
a 832841
 
9.0%
u 789136
 
8.5%
o 733517
 
7.9%
r 656869
 
7.1%
e 630821
 
6.8%
t 508552
 
5.5%
l 479658
 
5.2%
n 466895
 
5.1%
Other values (16) 2143798
23.2%
Uppercase Letter
ValueCountFrequency (%)
M 93612
16.4%
P 83184
14.5%
C 55886
9.8%
S 54073
9.5%
T 51158
8.9%
A 32606
 
5.7%
R 30807
 
5.4%
L 22099
 
3.9%
D 21617
 
3.8%
N 20451
 
3.6%
Other values (14) 106295
18.6%
Space Separator
ValueCountFrequency (%)
(space) 571788
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9802971
94.5%
Common 571788
 
5.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 1138591
11.6%
i 850505
 
8.7%
a 832841
 
8.5%
u 789136
 
8.0%
o 733517
 
7.5%
r 656869
 
6.7%
e 630821
 
6.4%
t 508552
 
5.2%
l 479658
 
4.9%
n 466895
 
4.8%
Other values (40) 2715586
27.7%
Common
ValueCountFrequency (%)
(space) 571788
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10374759
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 1138591
 
11.0%
i 850505
 
8.2%
a 832841
 
8.0%
u 789136
 
7.6%
o 733517
 
7.1%
r 656869
 
6.3%
e 630821
 
6.1%
(space) 571788
 
5.5%
t 508552
 
4.9%
l 479658
 
4.6%
Other values (41) 3182481
30.7%
Distinct6815
Distinct (%)1.1%
Missing0
Missing (%)0.0%
Memory size4.6 MiB

Length

Max length147
Median length76
Mean length34.65024416
Min length7

Characters and Unicode

Total characters20840424
Distinct characters80
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique793 ?
Unique (%)0.1%

Sample

1st rowPotos flavus (Schreber, 1774)
2nd rowMicrotus longicaudus (Merriam, 1888)
3rd rowCarollia brevicaudum (Schinz, 1821)
4th rowPeromyscus mexicanus (Saussure, 1860)
5th rowTursiops truncatus (Montagu, 1821)
ValueCountFrequency (%)
linnaeus 54248
 
2.2%
1758 48971
 
2.0%
thomas 43832
 
1.8%
peromyscus 38753
 
1.6%
merriam 31855
 
1.3%
24205
 
1.0%
1821 22026
 
0.9%
rattus 21929
 
0.9%
wagner 21848
 
0.9%
j.a.allen 20548
 
0.8%
Other values (6260) 2160706
86.8%

Most occurring characters

ValueCountFrequency (%)
(space) 1887470
 
9.1%
s 1613579
 
7.7%
a 1375849
 
6.6%
i 1277042
 
6.1%
e 1193882
 
5.7%
r 1077033
 
5.2%
u 1043189
 
5.0%
o 1026603
 
4.9%
n 888434
 
4.3%
l 794841
 
3.8%
Other values (70) 8662502
41.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 14209920
68.2%
Decimal Number 2149776
 
10.3%
Space Separator 1887470
 
9.1%
Uppercase Letter 1266302
 
6.1%
Other Punctuation 650408
 
3.1%
Open Punctuation 334773
 
1.6%
Close Punctuation 334773
 
1.6%
Dash Punctuation 7002
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 1613579
11.4%
a 1375849
9.7%
i 1277042
 
9.0%
e 1193882
 
8.4%
r 1077033
 
7.6%
u 1043189
 
7.3%
o 1026603
 
7.2%
n 888434
 
6.3%
l 794841
 
5.6%
t 695386
 
4.9%
Other values (24) 3224082
22.7%
Uppercase Letter
ValueCountFrequency (%)
M 177257
14.0%
P 122846
9.7%
T 110795
 
8.7%
S 106664
 
8.4%
A 99376
 
7.8%
L 97931
 
7.7%
G 78980
 
6.2%
C 74570
 
5.9%
B 54633
 
4.3%
R 50431
 
4.0%
Other values (18) 292819
23.1%
Decimal Number
ValueCountFrequency (%)
1 658265
30.6%
8 464785
21.6%
9 213267
 
9.9%
7 166016
 
7.7%
5 145419
 
6.8%
0 128064
 
6.0%
4 104493
 
4.9%
3 97629
 
4.5%
2 90691
 
4.2%
6 81147
 
3.8%
Other Punctuation
ValueCountFrequency (%)
, 539214
82.9%
. 86211
 
13.3%
& 24205
 
3.7%
' 778
 
0.1%
Space Separator
ValueCountFrequency (%)
(space) 1887470
100.0%
Open Punctuation
ValueCountFrequency (%)
( 334773
100.0%
Close Punctuation
ValueCountFrequency (%)
) 334773
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 7002
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 15476222
74.3%
Common 5364202
 
25.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 1613579
 
10.4%
a 1375849
 
8.9%
i 1277042
 
8.3%
e 1193882
 
7.7%
r 1077033
 
7.0%
u 1043189
 
6.7%
o 1026603
 
6.6%
n 888434
 
5.7%
l 794841
 
5.1%
t 695386
 
4.5%
Other values (52) 4490384
29.0%
Common
ValueCountFrequency (%)
(space) 1887470
35.2%
1 658265
 
12.3%
, 539214
 
10.1%
8 464785
 
8.7%
( 334773
 
6.2%
) 334773
 
6.2%
9 213267
 
4.0%
7 166016
 
3.1%
5 145419
 
2.7%
0 128064
 
2.4%
Other values (8) 492156
 
9.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20827466
99.9%
None 12958
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 1887470
 
9.1%
s 1613579
 
7.7%
a 1375849
 
6.6%
i 1277042
 
6.1%
e 1193882
 
5.7%
r 1077033
 
5.2%
u 1043189
 
5.0%
o 1026603
 
4.9%
n 888434
 
4.3%
l 794841
 
3.8%
Other values (60) 8649544
41.5%
None
ValueCountFrequency (%)
ü 5162
39.8%
É 4278
33.0%
é 1649
 
12.7%
è 1421
 
11.0%
ö 310
 
2.4%
í 70
 
0.5%
Ä 24
 
0.2%
ä 24
 
0.2%
á 19
 
0.1%
ñ 1
 
< 0.1%
Distinct7805
Distinct (%)1.3%
Missing0
Missing (%)0.0%
Memory size4.6 MiB

Length

Max length63
Median length43
Mean length22.61255364
Min length5

Characters and Unicode

Total characters13600343
Distinct characters63
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique898 ?
Unique (%)0.1%

Sample

1st rowPotos flavus
2nd rowMicrotus longicaudus longicaudus
3rd rowCarollia brevicauda
4th rowPeromyscus mexicanus totontepecus
5th rowTursiops truncatus
ValueCountFrequency (%)
peromyscus 38753
 
2.6%
sp 28343
 
1.9%
rattus 21929
 
1.5%
microtus 19877
 
1.3%
maniculatus 15880
 
1.1%
sorex 15831
 
1.1%
artibeus 12470
 
0.8%
carollia 12281
 
0.8%
tursiops 11895
 
0.8%
truncatus 11875
 
0.8%
Other values (5505) 1302266
87.3%

Most occurring characters

ValueCountFrequency (%)
s 1517215
 
11.2%
i 1187099
 
8.7%
a 1082276
 
8.0%
u 980723
 
7.2%
o 902387
 
6.6%
(space) 889949
 
6.5%
e 862255
 
6.3%
r 848292
 
6.2%
n 665623
 
4.9%
l 634731
 
4.7%
Other values (53) 4029793
29.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12079597
88.8%
Space Separator 889949
 
6.5%
Uppercase Letter 601771
 
4.4%
Other Punctuation 28356
 
0.2%
Open Punctuation 313
 
< 0.1%
Close Punctuation 313
 
< 0.1%
Decimal Number 44
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 1517215
12.6%
i 1187099
9.8%
a 1082276
 
9.0%
u 980723
 
8.1%
o 902387
 
7.5%
e 862255
 
7.1%
r 848292
 
7.0%
n 665623
 
5.5%
l 634731
 
5.3%
t 618435
 
5.1%
Other values (16) 2780561
23.0%
Uppercase Letter
ValueCountFrequency (%)
M 103156
17.1%
P 84557
14.1%
C 58907
9.8%
S 54594
9.1%
T 51645
8.6%
A 32571
 
5.4%
R 31119
 
5.2%
G 28180
 
4.7%
L 23175
 
3.9%
N 23069
 
3.8%
Other values (14) 110798
18.4%
Decimal Number
ValueCountFrequency (%)
8 13
29.5%
1 12
27.3%
2 7
15.9%
9 6
13.6%
5 3
 
6.8%
0 2
 
4.5%
4 1
 
2.3%
Other Punctuation
ValueCountFrequency (%)
. 28343
> 99.9%
, 11
 
< 0.1%
/ 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 889949
100.0%
Open Punctuation
ValueCountFrequency (%)
( 313
100.0%
Close Punctuation
ValueCountFrequency (%)
) 313
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12681368
93.2%
Common 918975
 
6.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 1517215
12.0%
i 1187099
 
9.4%
a 1082276
 
8.5%
u 980723
 
7.7%
o 902387
 
7.1%
e 862255
 
6.8%
r 848292
 
6.7%
n 665623
 
5.2%
l 634731
 
5.0%
t 618435
 
4.9%
Other values (40) 3382332
26.7%
Common
ValueCountFrequency (%)
(space) 889949
96.8%
. 28343
 
3.1%
( 313
 
< 0.1%
) 313
 
< 0.1%
8 13
 
< 0.1%
1 12
 
< 0.1%
, 11
 
< 0.1%
2 7
 
< 0.1%
9 6
 
< 0.1%
5 3
 
< 0.1%
Other values (3) 5
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13600343
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 1517215
 
11.2%
i 1187099
 
8.7%
a 1082276
 
8.0%
u 980723
 
7.2%
o 902387
 
6.6%
(space) 889949
 
6.5%
e 862255
 
6.3%
r 848292
 
6.2%
n 665623
 
4.9%
l 634731
 
4.7%
Other values (53) 4029793
29.6%

protocol
Text

Constant 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.6 MiB

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1804353
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowEML
2nd rowEML
3rd rowEML
4th rowEML
5th rowEML
ValueCountFrequency (%)
eml 601451
100.0%

Most occurring characters

ValueCountFrequency (%)
E 601451
33.3%
M 601451
33.3%
L 601451
33.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1804353
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 601451
33.3%
M 601451
33.3%
L 601451
33.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 1804353
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 601451
33.3%
M 601451
33.3%
L 601451
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1804353
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 601451
33.3%
M 601451
33.3%
L 601451
33.3%
Distinct185984
Distinct (%)30.9%
Missing0
Missing (%)0.0%
Memory size4.6 MiB

Length

Max length24
Median length24
Mean length23.99573698
Min length20

Characters and Unicode

Total characters14432260
Distinct characters15
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique38937 ?
Unique (%)6.5%

Sample

1st row2024-12-02T13:58:01.255Z
2nd row2024-12-02T13:59:38.442Z
3rd row2024-12-02T13:56:07.605Z
4th row2024-12-02T13:58:24.850Z
5th row2024-12-02T13:56:12.476Z
ValueCountFrequency (%)
2024-12-02t13:57:14.377z 17
 
< 0.1%
2024-12-02t13:57:24.313z 17
 
< 0.1%
2024-12-02t13:57:59.063z 17
 
< 0.1%
2024-12-02t13:57:52.813z 17
 
< 0.1%
2024-12-02t13:57:15.231z 17
 
< 0.1%
2024-12-02t13:57:50.062z 16
 
< 0.1%
2024-12-02t13:57:52.024z 16
 
< 0.1%
2024-12-02t13:57:25.776z 16
 
< 0.1%
2024-12-02t13:56:59.760z 15
 
< 0.1%
2024-12-02t13:57:24.391z 15
 
< 0.1%
Other values (185974) 601288
> 99.9%

Most occurring characters

ValueCountFrequency (%)
2 2746380
19.0%
0 1525337
10.6%
1 1517832
10.5%
- 1202902
8.3%
: 1202902
8.3%
4 967155
 
6.7%
5 955236
 
6.6%
3 952306
 
6.6%
T 601451
 
4.2%
Z 601451
 
4.2%
Other values (5) 2159308
15.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 10222744
70.8%
Other Punctuation 1803712
 
12.5%
Dash Punctuation 1202902
 
8.3%
Uppercase Letter 1202902
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 2746380
26.9%
0 1525337
14.9%
1 1517832
14.8%
4 967155
 
9.5%
5 955236
 
9.3%
3 952306
 
9.3%
7 460995
 
4.5%
9 384640
 
3.8%
6 362872
 
3.5%
8 349991
 
3.4%
Other Punctuation
ValueCountFrequency (%)
: 1202902
66.7%
. 600810
33.3%
Uppercase Letter
ValueCountFrequency (%)
T 601451
50.0%
Z 601451
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 1202902
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 13229358
91.7%
Latin 1202902
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
2 2746380
20.8%
0 1525337
11.5%
1 1517832
11.5%
- 1202902
9.1%
: 1202902
9.1%
4 967155
 
7.3%
5 955236
 
7.2%
3 952306
 
7.2%
. 600810
 
4.5%
7 460995
 
3.5%
Other values (3) 1097503
 
8.3%
Latin
ValueCountFrequency (%)
T 601451
50.0%
Z 601451
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14432260
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2746380
19.0%
0 1525337
10.6%
1 1517832
10.5%
- 1202902
8.3%
: 1202902
8.3%
4 967155
 
6.7%
5 955236
 
6.6%
3 952306
 
6.6%
T 601451
 
4.2%
Z 601451
 
4.2%
Other values (5) 2159308
15.0%

lastCrawled
Text

Constant 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.6 MiB

Length

Max length24
Median length24
Mean length24
Min length24

Characters and Unicode

Total characters14434824
Distinct characters12
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2024-12-02T11:48:23.416Z
2nd row2024-12-02T11:48:23.416Z
3rd row2024-12-02T11:48:23.416Z
4th row2024-12-02T11:48:23.416Z
5th row2024-12-02T11:48:23.416Z
ValueCountFrequency (%)
2024-12-02t11:48:23.416z 601451
100.0%

Most occurring characters

ValueCountFrequency (%)
2 3007255
20.8%
1 2405804
16.7%
4 1804353
12.5%
0 1202902
 
8.3%
- 1202902
 
8.3%
: 1202902
 
8.3%
T 601451
 
4.2%
8 601451
 
4.2%
3 601451
 
4.2%
. 601451
 
4.2%
Other values (2) 1202902
 
8.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 10224667
70.8%
Other Punctuation 1804353
 
12.5%
Dash Punctuation 1202902
 
8.3%
Uppercase Letter 1202902
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 3007255
29.4%
1 2405804
23.5%
4 1804353
17.6%
0 1202902
 
11.8%
8 601451
 
5.9%
3 601451
 
5.9%
6 601451
 
5.9%
Other Punctuation
ValueCountFrequency (%)
: 1202902
66.7%
. 601451
33.3%
Uppercase Letter
ValueCountFrequency (%)
T 601451
50.0%
Z 601451
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 1202902
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 13231922
91.7%
Latin 1202902
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
2 3007255
22.7%
1 2405804
18.2%
4 1804353
13.6%
0 1202902
 
9.1%
- 1202902
 
9.1%
: 1202902
 
9.1%
8 601451
 
4.5%
3 601451
 
4.5%
. 601451
 
4.5%
6 601451
 
4.5%
Latin
ValueCountFrequency (%)
T 601451
50.0%
Z 601451
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14434824
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 3007255
20.8%
1 2405804
16.7%
4 1804353
12.5%
0 1202902
 
8.3%
- 1202902
 
8.3%
: 1202902
 
8.3%
T 601451
 
4.2%
8 601451
 
4.2%
3 601451
 
4.2%
. 601451
 
4.2%
Other values (2) 1202902
 
8.3%
Distinct2
Distinct (%)< 0.1%
Missing2505
Missing (%)0.4%
Memory size4.6 MiB

Length

Max length5
Median length4
Mean length4.377813693
Min length4

Characters and Unicode

Total characters2622074
Distinct characters8
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowtrue
2nd rowfalse
3rd rowtrue
4th rowtrue
5th rowfalse
ValueCountFrequency (%)
true 372656
62.2%
false 226290
37.8%

Most occurring characters

ValueCountFrequency (%)
e 598946
22.8%
t 372656
14.2%
r 372656
14.2%
u 372656
14.2%
f 226290
 
8.6%
a 226290
 
8.6%
l 226290
 
8.6%
s 226290
 
8.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2622074
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 598946
22.8%
t 372656
14.2%
r 372656
14.2%
u 372656
14.2%
f 226290
 
8.6%
a 226290
 
8.6%
l 226290
 
8.6%
s 226290
 
8.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 2622074
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 598946
22.8%
t 372656
14.2%
r 372656
14.2%
u 372656
14.2%
f 226290
 
8.6%
a 226290
 
8.6%
l 226290
 
8.6%
s 226290
 
8.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2622074
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 598946
22.8%
t 372656
14.2%
r 372656
14.2%
u 372656
14.2%
f 226290
 
8.6%
a 226290
 
8.6%
l 226290
 
8.6%
s 226290
 
8.6%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.6 MiB

Length

Max length5
Median length5
Mean length4.998247571
Min length4

Characters and Unicode

Total characters3006201
Distinct characters8
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowfalse
2nd rowfalse
3rd rowfalse
4th rowfalse
5th rowfalse
ValueCountFrequency (%)
false 600397
99.8%
true 1054
 
0.2%

Most occurring characters

ValueCountFrequency (%)
e 601451
20.0%
f 600397
20.0%
a 600397
20.0%
l 600397
20.0%
s 600397
20.0%
t 1054
 
< 0.1%
r 1054
 
< 0.1%
u 1054
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3006201
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 601451
20.0%
f 600397
20.0%
a 600397
20.0%
l 600397
20.0%
s 600397
20.0%
t 1054
 
< 0.1%
r 1054
 
< 0.1%
u 1054
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 3006201
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 601451
20.0%
f 600397
20.0%
a 600397
20.0%
l 600397
20.0%
s 600397
20.0%
t 1054
 
< 0.1%
r 1054
 
< 0.1%
u 1054
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3006201
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 601451
20.0%
f 600397
20.0%
a 600397
20.0%
l 600397
20.0%
s 600397
20.0%
t 1054
 
< 0.1%
r 1054
 
< 0.1%
u 1054
 
< 0.1%

gbifRegion
Text

Missing 

Distinct7
Distinct (%)< 0.1%
Missing15955
Missing (%)2.7%
Memory size4.6 MiB

Length

Max length13
Median length13
Mean length10.49816395
Min length4

Characters and Unicode

Total characters6146633
Distinct characters16
Distinct categories2
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowLATIN_AMERICA
2nd rowNORTH_AMERICA
3rd rowLATIN_AMERICA
4th rowLATIN_AMERICA
5th rowNORTH_AMERICA
ValueCountFrequency (%)
north_america 245840
42.0%
latin_america 145714
24.9%
africa 101325
17.3%
asia 63583
 
10.9%
europe 17807
 
3.0%
oceania 8321
 
1.4%
antarctica 2906
 
0.5%

Most occurring characters

ValueCountFrequency (%)
A 1283998
20.9%
R 759432
12.4%
I 713403
11.6%
C 507012
 
8.2%
E 435489
 
7.1%
N 402781
 
6.6%
T 397366
 
6.5%
_ 391554
 
6.4%
M 391554
 
6.4%
O 271968
 
4.4%
Other values (6) 592076
9.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 5755079
93.6%
Connector Punctuation 391554
 
6.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 1283998
22.3%
R 759432
13.2%
I 713403
12.4%
C 507012
 
8.8%
E 435489
 
7.6%
N 402781
 
7.0%
T 397366
 
6.9%
M 391554
 
6.8%
O 271968
 
4.7%
H 245840
 
4.3%
Other values (5) 346236
 
6.0%
Connector Punctuation
ValueCountFrequency (%)
_ 391554
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5755079
93.6%
Common 391554
 
6.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 1283998
22.3%
R 759432
13.2%
I 713403
12.4%
C 507012
 
8.8%
E 435489
 
7.6%
N 402781
 
7.0%
T 397366
 
6.9%
M 391554
 
6.8%
O 271968
 
4.7%
H 245840
 
4.3%
Other values (5) 346236
 
6.0%
Common
ValueCountFrequency (%)
_ 391554
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6146633
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 1283998
20.9%
R 759432
12.4%
I 713403
11.6%
C 507012
 
8.2%
E 435489
 
7.1%
N 402781
 
6.6%
T 397366
 
6.5%
_ 391554
 
6.4%
M 391554
 
6.4%
O 271968
 
4.4%
Other values (6) 592076
9.6%

publishedByGbifRegion
Text

Constant 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.6 MiB

Length

Max length13
Median length13
Mean length13
Min length13

Characters and Unicode

Total characters7818863
Distinct characters11
Distinct categories2
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowNORTH_AMERICA
2nd rowNORTH_AMERICA
3rd rowNORTH_AMERICA
4th rowNORTH_AMERICA
5th rowNORTH_AMERICA
ValueCountFrequency (%)
north_america 601451
100.0%

Most occurring characters

ValueCountFrequency (%)
R 1202902
15.4%
A 1202902
15.4%
N 601451
7.7%
O 601451
7.7%
T 601451
7.7%
H 601451
7.7%
_ 601451
7.7%
M 601451
7.7%
E 601451
7.7%
I 601451
7.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 7217412
92.3%
Connector Punctuation 601451
 
7.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R 1202902
16.7%
A 1202902
16.7%
N 601451
8.3%
O 601451
8.3%
T 601451
8.3%
H 601451
8.3%
M 601451
8.3%
E 601451
8.3%
I 601451
8.3%
C 601451
8.3%
Connector Punctuation
ValueCountFrequency (%)
_ 601451
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7217412
92.3%
Common 601451
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
R 1202902
16.7%
A 1202902
16.7%
N 601451
8.3%
O 601451
8.3%
T 601451
8.3%
H 601451
8.3%
M 601451
8.3%
E 601451
8.3%
I 601451
8.3%
C 601451
8.3%
Common
ValueCountFrequency (%)
_ 601451
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7818863
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R 1202902
15.4%
A 1202902
15.4%
N 601451
7.7%
O 601451
7.7%
T 601451
7.7%
H 601451
7.7%
_ 601451
7.7%
M 601451
7.7%
E 601451
7.7%
I 601451
7.7%

level0Gid
Text

Missing 

Distinct157
Distinct (%)0.1%
Missing473902
Missing (%)78.8%
Memory size4.6 MiB

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters382647
Distinct characters29
Distinct categories2
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique22
Unique (%)< 0.1%

Sample

1st rowVEN
2nd rowAFG
3rd rowZ01
4th rowVEN
5th rowZAF
ValueCountFrequency (%)
ven 22481
17.6%
usa 11290
 
8.9%
zaf 9365
 
7.3%
gha 6969
 
5.5%
mar 6781
 
5.3%
idn 6468
 
5.1%
bwa 4488
 
3.5%
bfa 4128
 
3.2%
moz 3329
 
2.6%
pan 3025
 
2.4%
Other values (147) 49225
38.6%

Most occurring characters

ValueCountFrequency (%)
A 57657
15.1%
N 44072
 
11.5%
E 35574
 
9.3%
V 25784
 
6.7%
M 21228
 
5.5%
S 19507
 
5.1%
Z 16511
 
4.3%
G 16274
 
4.3%
F 15976
 
4.2%
B 15801
 
4.1%
Other values (19) 114263
29.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 379759
99.2%
Decimal Number 2888
 
0.8%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 57657
15.2%
N 44072
 
11.6%
E 35574
 
9.4%
V 25784
 
6.8%
M 21228
 
5.6%
S 19507
 
5.1%
Z 16511
 
4.3%
G 16274
 
4.3%
F 15976
 
4.2%
B 15801
 
4.2%
Other values (16) 111375
29.3%
Decimal Number
ValueCountFrequency (%)
0 1444
50.0%
1 1146
39.7%
6 298
 
10.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 379759
99.2%
Common 2888
 
0.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 57657
15.2%
N 44072
 
11.6%
E 35574
 
9.4%
V 25784
 
6.8%
M 21228
 
5.6%
S 19507
 
5.1%
Z 16511
 
4.3%
G 16274
 
4.3%
F 15976
 
4.2%
B 15801
 
4.2%
Other values (16) 111375
29.3%
Common
ValueCountFrequency (%)
0 1444
50.0%
1 1146
39.7%
6 298
 
10.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 382647
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 57657
15.1%
N 44072
 
11.5%
E 35574
 
9.3%
V 25784
 
6.7%
M 21228
 
5.5%
S 19507
 
5.1%
Z 16511
 
4.3%
G 16274
 
4.3%
F 15976
 
4.2%
B 15801
 
4.1%
Other values (19) 114263
29.9%

level0Name
Text

Missing 

Distinct157
Distinct (%)0.1%
Missing473902
Missing (%)78.8%
Memory size4.6 MiB

Length

Max length32
Median length24
Mean length9.472931971
Min length4

Characters and Unicode

Total characters1208263
Distinct characters59
Distinct categories7
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique22
Unique (%)< 0.1%

Sample

1st rowVenezuela
2nd rowAfghanistan
3rd rowJammu and Kashmir
4th rowVenezuela
5th rowSouth Africa
ValueCountFrequency (%)
venezuela 22481
 
13.1%
united 11945
 
7.0%
states 11376
 
6.6%
south 10173
 
5.9%
africa 9365
 
5.5%
ghana 6969
 
4.1%
morocco 6781
 
3.9%
indonesia 6468
 
3.8%
botswana 4488
 
2.6%
burkina 4128
 
2.4%
Other values (185) 77505
45.1%

Most occurring characters

ValueCountFrequency (%)
a 157852
 
13.1%
e 138859
 
11.5%
n 95632
 
7.9%
i 91451
 
7.6%
o 66935
 
5.5%
t 62305
 
5.2%
u 52405
 
4.3%
r 46349
 
3.8%
(space) 44130
 
3.7%
l 37921
 
3.1%
Other values (49) 414424
34.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 992249
82.1%
Uppercase Letter 168966
 
14.0%
Space Separator 44130
 
3.7%
Other Punctuation 2902
 
0.2%
Dash Punctuation 14
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 157852
15.9%
e 138859
14.0%
n 95632
9.6%
i 91451
9.2%
o 66935
 
6.7%
t 62305
 
6.3%
u 52405
 
5.3%
r 46349
 
4.7%
l 37921
 
3.8%
s 37179
 
3.7%
Other values (19) 205361
20.7%
Uppercase Letter
ValueCountFrequency (%)
S 25721
15.2%
V 22909
13.6%
M 16407
9.7%
B 14067
8.3%
A 12684
7.5%
U 12397
7.3%
G 10000
 
5.9%
I 9818
 
5.8%
P 7377
 
4.4%
C 7240
 
4.3%
Other values (13) 30346
18.0%
Other Punctuation
ValueCountFrequency (%)
' 2873
99.0%
. 18
 
0.6%
, 11
 
0.4%
Space Separator
ValueCountFrequency (%)
(space) 44130
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 14
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1161215
96.1%
Common 47048
 
3.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 157852
13.6%
e 138859
 
12.0%
n 95632
 
8.2%
i 91451
 
7.9%
o 66935
 
5.8%
t 62305
 
5.4%
u 52405
 
4.5%
r 46349
 
4.0%
l 37921
 
3.3%
s 37179
 
3.2%
Other values (42) 374327
32.2%
Common
ValueCountFrequency (%)
(space) 44130
93.8%
' 2873
 
6.1%
. 18
 
< 0.1%
- 14
 
< 0.1%
, 11
 
< 0.1%
( 1
 
< 0.1%
) 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1205267
99.8%
None 2996
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 157852
 
13.1%
e 138859
 
11.5%
n 95632
 
7.9%
i 91451
 
7.6%
o 66935
 
5.6%
t 62305
 
5.2%
u 52405
 
4.3%
r 46349
 
3.8%
(space) 44130
 
3.7%
l 37921
 
3.1%
Other values (46) 411428
34.1%
None
ValueCountFrequency (%)
ô 2873
95.9%
é 122
 
4.1%
ç 1
 
< 0.1%

level1Gid
Text

Missing 

Distinct906
Distinct (%)0.7%
Missing473930
Missing (%)78.8%
Memory size4.6 MiB

Length

Max length8
Median length7
Mean length7.41927212
Min length6

Characters and Unicode

Total characters946113
Distinct characters38
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique191
Unique (%)0.1%

Sample

1st rowVEN.6_1
2nd rowAFG.15_1
3rd rowZ01.14_1
4th rowVEN.1_1
5th rowZAF.8_1
ValueCountFrequency (%)
ven.1_1 6194
 
4.9%
zaf.8_1 3031
 
2.4%
ven.6_1 2186
 
1.7%
bwa.12_1 2159
 
1.7%
ven.12_1 1504
 
1.2%
caf.16_1 1500
 
1.2%
eth.8_1 1491
 
1.2%
mar.6_1 1470
 
1.2%
ven.24_1 1465
 
1.1%
mar.12_1 1449
 
1.1%
Other values (896) 105072
82.4%

Most occurring characters

ValueCountFrequency (%)
1 179942
19.0%
_ 127518
13.5%
. 120552
12.7%
A 57624
 
6.1%
N 44072
 
4.7%
2 40011
 
4.2%
E 35574
 
3.8%
V 25784
 
2.7%
M 21227
 
2.2%
4 20211
 
2.1%
Other values (28) 273598
28.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 379684
40.1%
Decimal Number 318359
33.6%
Connector Punctuation 127518
 
13.5%
Other Punctuation 120552
 
12.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 57624
15.2%
N 44072
 
11.6%
E 35574
 
9.4%
V 25784
 
6.8%
M 21227
 
5.6%
S 19503
 
5.1%
Z 16511
 
4.3%
G 16272
 
4.3%
F 15975
 
4.2%
B 15797
 
4.2%
Other values (16) 111345
29.3%
Decimal Number
ValueCountFrequency (%)
1 179942
56.5%
2 40011
 
12.6%
4 20211
 
6.3%
3 16182
 
5.1%
6 13323
 
4.2%
5 12363
 
3.9%
0 10706
 
3.4%
8 10060
 
3.2%
7 8024
 
2.5%
9 7537
 
2.4%
Connector Punctuation
ValueCountFrequency (%)
_ 127518
100.0%
Other Punctuation
ValueCountFrequency (%)
. 120552
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 566429
59.9%
Latin 379684
40.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 57624
15.2%
N 44072
 
11.6%
E 35574
 
9.4%
V 25784
 
6.8%
M 21227
 
5.6%
S 19503
 
5.1%
Z 16511
 
4.3%
G 16272
 
4.3%
F 15975
 
4.2%
B 15797
 
4.2%
Other values (16) 111345
29.3%
Common
ValueCountFrequency (%)
1 179942
31.8%
_ 127518
22.5%
. 120552
21.3%
2 40011
 
7.1%
4 20211
 
3.6%
3 16182
 
2.9%
6 13323
 
2.4%
5 12363
 
2.2%
0 10706
 
1.9%
8 10060
 
1.8%
Other values (2) 15561
 
2.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 946113
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 179942
19.0%
_ 127518
13.5%
. 120552
12.7%
A 57624
 
6.1%
N 44072
 
4.7%
2 40011
 
4.2%
E 35574
 
3.8%
V 25784
 
2.7%
M 21227
 
2.2%
4 20211
 
2.1%
Other values (28) 273598
28.9%

level1Name
Text

Missing 

Distinct882
Distinct (%)0.7%
Missing473930
Missing (%)78.8%
Memory size4.6 MiB

Length

Max length32
Median length29
Mean length9.408136699
Min length3

Characters and Unicode

Total characters1199735
Distinct characters90
Distinct categories5
Distinct scripts2
Distinct blocks3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique187
Unique (%)0.1%

Sample

1st rowBolívar
2nd rowKandahar
3rd rowJammu and Kashmir
4th rowAmazonas
5th rowNorthern Cape
ValueCountFrequency (%)
8657
 
4.8%
amazonas 6326
 
3.5%
cape 5248
 
2.9%
northern 4629
 
2.6%
eastern 4155
 
2.3%
bolívar 2189
 
1.2%
north-west 2159
 
1.2%
barat 2126
 
1.2%
west 1913
 
1.1%
western 1800
 
1.0%
Other values (1015) 140238
78.2%

Most occurring characters

ValueCountFrequency (%)
a 186875
15.6%
r 89084
 
7.4%
n 79844
 
6.7%
e 79607
 
6.6%
o 67186
 
5.6%
t 58756
 
4.9%
i 53839
 
4.5%
s 53114
 
4.4%
(space) 51919
 
4.3%
l 43110
 
3.6%
Other values (80) 436401
36.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 949271
79.1%
Uppercase Letter 178927
 
14.9%
Space Separator 51919
 
4.3%
Dash Punctuation 19040
 
1.6%
Other Punctuation 578
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 186875
19.7%
r 89084
9.4%
n 79844
8.4%
e 79607
8.4%
o 67186
 
7.1%
t 58756
 
6.2%
i 53839
 
5.7%
s 53114
 
5.6%
l 43110
 
4.5%
u 40632
 
4.3%
Other values (47) 197224
20.8%
Uppercase Letter
ValueCountFrequency (%)
S 17121
 
9.6%
C 16461
 
9.2%
N 15709
 
8.8%
A 15683
 
8.8%
M 14877
 
8.3%
B 11765
 
6.6%
T 11508
 
6.4%
E 10199
 
5.7%
K 8533
 
4.8%
W 6887
 
3.8%
Other values (18) 50184
28.0%
Other Punctuation
ValueCountFrequency (%)
! 398
68.9%
' 106
 
18.3%
, 74
 
12.8%
Space Separator
ValueCountFrequency (%)
(space) 51919
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 19040
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1128198
94.0%
Common 71537
 
6.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 186875
16.6%
r 89084
 
7.9%
n 79844
 
7.1%
e 79607
 
7.1%
o 67186
 
6.0%
t 58756
 
5.2%
i 53839
 
4.8%
s 53114
 
4.7%
l 43110
 
3.8%
u 40632
 
3.6%
Other values (75) 376151
33.3%
Common
ValueCountFrequency (%)
(space) 51919
72.6%
- 19040
 
26.6%
! 398
 
0.6%
' 106
 
0.1%
, 74
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1180925
98.4%
None 18427
 
1.5%
Latin Ext Additional 383
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 186875
15.8%
r 89084
 
7.5%
n 79844
 
6.8%
e 79607
 
6.7%
o 67186
 
5.7%
t 58756
 
5.0%
i 53839
 
4.6%
s 53114
 
4.5%
(space) 51919
 
4.4%
l 43110
 
3.7%
Other values (47) 417591
35.4%
None
ValueCountFrequency (%)
é 6270
34.0%
á 3586
19.5%
í 3080
16.7%
ó 2234
 
12.1%
â 1779
 
9.7%
è 833
 
4.5%
Đ 250
 
1.4%
ô 144
 
0.8%
à 101
 
0.5%
ò 51
 
0.3%
Other values (14) 99
 
0.5%
Latin Ext Additional
ValueCountFrequency (%)
132
34.5%
92
24.0%
60
15.7%
57
14.9%
19
 
5.0%
11
 
2.9%
ế 5
 
1.3%
5
 
1.3%
2
 
0.5%

level2Gid
Text

Missing 

Distinct2378
Distinct (%)1.9%
Missing475037
Missing (%)79.0%
Memory size4.6 MiB

Length

Max length12
Median length11
Mean length9.659871533
Min length7

Characters and Unicode

Total characters1221143
Distinct characters38
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique658
Unique (%)0.5%

Sample

1st rowVEN.6.10_1
2nd rowAFG.15.3_1
3rd rowZ01.14.3_1
4th rowVEN.1.6_1
5th rowZAF.8.5_1
ValueCountFrequency (%)
ven.1.5_1 2542
 
2.0%
bwa.12.2_1 1980
 
1.6%
ven.1.1_1 1644
 
1.3%
caf.16.2_1 1500
 
1.2%
eth.8.8_1 1196
 
0.9%
zaf.8.5_1 1077
 
0.9%
ven.6.10_1 1052
 
0.8%
sle.2.1_1 1049
 
0.8%
sle.1.2_1 1037
 
0.8%
zaf.8.4_1 1035
 
0.8%
Other values (2368) 112302
88.8%

Most occurring characters

ValueCountFrequency (%)
. 245856
20.1%
1 216878
17.8%
_ 126414
 
10.4%
2 73592
 
6.0%
A 57593
 
4.7%
N 44067
 
3.6%
E 35567
 
2.9%
4 35248
 
2.9%
3 33422
 
2.7%
5 26946
 
2.2%
Other values (28) 325560
26.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 472519
38.7%
Uppercase Letter 376354
30.8%
Other Punctuation 245856
20.1%
Connector Punctuation 126414
 
10.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 57593
15.3%
N 44067
11.7%
E 35567
 
9.5%
V 25781
 
6.9%
M 21036
 
5.6%
S 19455
 
5.2%
Z 16332
 
4.3%
G 16226
 
4.3%
F 15974
 
4.2%
B 15301
 
4.1%
Other values (16) 109022
29.0%
Decimal Number
ValueCountFrequency (%)
1 216878
45.9%
2 73592
 
15.6%
4 35248
 
7.5%
3 33422
 
7.1%
5 26946
 
5.7%
6 22912
 
4.8%
0 17324
 
3.7%
8 16328
 
3.5%
7 15121
 
3.2%
9 14748
 
3.1%
Other Punctuation
ValueCountFrequency (%)
. 245856
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 126414
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 844789
69.2%
Latin 376354
30.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 57593
15.3%
N 44067
11.7%
E 35567
 
9.5%
V 25781
 
6.9%
M 21036
 
5.6%
S 19455
 
5.2%
Z 16332
 
4.3%
G 16226
 
4.3%
F 15974
 
4.2%
B 15301
 
4.1%
Other values (16) 109022
29.0%
Common
ValueCountFrequency (%)
. 245856
29.1%
1 216878
25.7%
_ 126414
15.0%
2 73592
 
8.7%
4 35248
 
4.2%
3 33422
 
4.0%
5 26946
 
3.2%
6 22912
 
2.7%
0 17324
 
2.1%
8 16328
 
1.9%
Other values (2) 29869
 
3.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1221143
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 245856
20.1%
1 216878
17.8%
_ 126414
 
10.4%
2 73592
 
6.0%
A 57593
 
4.7%
N 44067
 
3.6%
E 35567
 
2.9%
4 35248
 
2.9%
3 33422
 
2.7%
5 26946
 
2.2%
Other values (28) 325560
26.7%

level2Name
Text

Missing 

Distinct2276
Distinct (%)1.8%
Missing475037
Missing (%)79.0%
Memory size4.6 MiB

Length

Max length32
Median length27
Mean length8.596326356
Min length2

Characters and Unicode

Total characters1086696
Distinct characters121
Distinct categories8
Distinct scripts2
Distinct blocks3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique606
Unique (%)0.5%

Sample

1st rowSifontes
2nd rowDaman
3rd rowBandipore
4th rowMaroa
5th rowSiyanda
ValueCountFrequency (%)
west 3557
 
2.1%
manapiare 2542
 
1.5%
ngamiland 2159
 
1.3%
south 2001
 
1.2%
alto 1647
 
1.0%
orinoco 1644
 
1.0%
east 1641
 
1.0%
nola 1500
 
0.9%
bolívar 1475
 
0.9%
miranda 1344
 
0.8%
Other values (2539) 146175
88.2%

Most occurring characters

ValueCountFrequency (%)
a 156174
 
14.4%
o 74427
 
6.8%
n 72702
 
6.7%
e 72325
 
6.7%
i 68255
 
6.3%
r 58313
 
5.4%
t 45881
 
4.2%
u 41995
 
3.9%
(space) 39271
 
3.6%
l 37304
 
3.4%
Other values (111) 420049
38.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 874904
80.5%
Uppercase Letter 166128
 
15.3%
Space Separator 39271
 
3.6%
Dash Punctuation 3492
 
0.3%
Other Punctuation 1688
 
0.2%
Decimal Number 661
 
0.1%
Open Punctuation 293
 
< 0.1%
Close Punctuation 259
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 156174
17.9%
o 74427
 
8.5%
n 72702
 
8.3%
e 72325
 
8.3%
i 68255
 
7.8%
r 58313
 
6.7%
t 45881
 
5.2%
u 41995
 
4.8%
l 37304
 
4.3%
s 33386
 
3.8%
Other values (59) 214142
24.5%
Uppercase Letter
ValueCountFrequency (%)
S 17317
 
10.4%
M 16983
 
10.2%
B 14294
 
8.6%
A 13999
 
8.4%
C 12443
 
7.5%
K 12120
 
7.3%
N 10665
 
6.4%
T 9065
 
5.5%
G 7232
 
4.4%
P 6475
 
3.9%
Other values (24) 45535
27.4%
Decimal Number
ValueCountFrequency (%)
1 386
58.4%
0 164
24.8%
3 52
 
7.9%
7 37
 
5.6%
5 9
 
1.4%
2 5
 
0.8%
8 3
 
0.5%
9 2
 
0.3%
6 2
 
0.3%
4 1
 
0.2%
Other Punctuation
ValueCountFrequency (%)
' 823
48.8%
/ 453
26.8%
. 408
24.2%
, 4
 
0.2%
Space Separator
ValueCountFrequency (%)
(space) 39271
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3492
100.0%
Open Punctuation
ValueCountFrequency (%)
( 293
100.0%
Close Punctuation
ValueCountFrequency (%)
) 259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1041032
95.8%
Common 45664
 
4.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 156174
15.0%
o 74427
 
7.1%
n 72702
 
7.0%
e 72325
 
6.9%
i 68255
 
6.6%
r 58313
 
5.6%
t 45881
 
4.4%
u 41995
 
4.0%
l 37304
 
3.6%
s 33386
 
3.2%
Other values (93) 380270
36.5%
Common
ValueCountFrequency (%)
(space) 39271
86.0%
- 3492
 
7.6%
' 823
 
1.8%
/ 453
 
1.0%
. 408
 
0.9%
1 386
 
0.8%
( 293
 
0.6%
) 259
 
0.6%
0 164
 
0.4%
3 52
 
0.1%
Other values (8) 63
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1067404
98.2%
None 19124
 
1.8%
Latin Ext Additional 168
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 156174
 
14.6%
o 74427
 
7.0%
n 72702
 
6.8%
e 72325
 
6.8%
i 68255
 
6.4%
r 58313
 
5.5%
t 45881
 
4.3%
u 41995
 
3.9%
(space) 39271
 
3.7%
l 37304
 
3.5%
Other values (60) 400757
37.5%
None
ValueCountFrequency (%)
é 5729
30.0%
á 3632
19.0%
í 3054
16.0%
ú 1751
 
9.2%
ó 1611
 
8.4%
è 1050
 
5.5%
ô 559
 
2.9%
ñ 346
 
1.8%
â 323
 
1.7%
É 204
 
1.1%
Other values (29) 865
 
4.5%
Latin Ext Additional
ValueCountFrequency (%)
49
29.2%
36
21.4%
35
20.8%
15
 
8.9%
9
 
5.4%
5
 
3.0%
5
 
3.0%
4
 
2.4%
4
 
2.4%
3
 
1.8%
Other values (2) 3
 
1.8%

level3Gid
Text

Missing 

Distinct1589
Distinct (%)2.6%
Missing539154
Missing (%)89.6%
Memory size4.6 MiB

Length

Max length14
Median length11
Mean length11.6395011
Min length11

Characters and Unicode

Total characters725106
Distinct characters34
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique471
Unique (%)0.8%

Sample

1st rowZ01.14.3.1_1
2nd rowZAF.8.5.3_1
3rd rowZ06.6.1.4_1
4th rowBFA.8.2.6_1
5th rowPHL.59.10.11_1
ValueCountFrequency (%)
sle.1.2.8_1 1037
 
1.7%
eth.8.8.11_1 988
 
1.6%
pan.11.1.1_1 727
 
1.2%
sle.2.1.13_1 717
 
1.2%
mar.6.2.2_1 637
 
1.0%
pan.2.10.3_1 610
 
1.0%
ssd.1.2.1_1 426
 
0.7%
zaf.8.5.3_1 419
 
0.7%
ben.2.5.2_1 418
 
0.7%
pan.4.2.6_1 413
 
0.7%
Other values (1579) 55905
89.7%

Most occurring characters

ValueCountFrequency (%)
. 186891
25.8%
1 125870
17.4%
_ 62297
 
8.6%
2 41681
 
5.7%
3 25313
 
3.5%
A 25141
 
3.5%
4 22242
 
3.1%
5 18614
 
2.6%
6 15803
 
2.2%
Z 15767
 
2.2%
Other values (24) 185487
25.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 291915
40.3%
Other Punctuation 186891
25.8%
Uppercase Letter 184003
25.4%
Connector Punctuation 62297
 
8.6%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 25141
13.7%
Z 15767
 
8.6%
N 15438
 
8.4%
F 13775
 
7.5%
M 13210
 
7.2%
E 12819
 
7.0%
R 10146
 
5.5%
I 10093
 
5.5%
D 8484
 
4.6%
B 8385
 
4.6%
Other values (12) 50745
27.6%
Decimal Number
ValueCountFrequency (%)
1 125870
43.1%
2 41681
 
14.3%
3 25313
 
8.7%
4 22242
 
7.6%
5 18614
 
6.4%
6 15803
 
5.4%
8 12708
 
4.4%
0 11747
 
4.0%
9 9752
 
3.3%
7 8185
 
2.8%
Other Punctuation
ValueCountFrequency (%)
. 186891
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 62297
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 541103
74.6%
Latin 184003
 
25.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 25141
13.7%
Z 15767
 
8.6%
N 15438
 
8.4%
F 13775
 
7.5%
M 13210
 
7.2%
E 12819
 
7.0%
R 10146
 
5.5%
I 10093
 
5.5%
D 8484
 
4.6%
B 8385
 
4.6%
Other values (12) 50745
27.6%
Common
ValueCountFrequency (%)
. 186891
34.5%
1 125870
23.3%
_ 62297
 
11.5%
2 41681
 
7.7%
3 25313
 
4.7%
4 22242
 
4.1%
5 18614
 
3.4%
6 15803
 
2.9%
8 12708
 
2.3%
0 11747
 
2.2%
Other values (2) 17937
 
3.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 725106
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 186891
25.8%
1 125870
17.4%
_ 62297
 
8.6%
2 41681
 
5.7%
3 25313
 
3.5%
A 25141
 
3.5%
4 22242
 
3.1%
5 18614
 
2.6%
6 15803
 
2.2%
Z 15767
 
2.2%
Other values (24) 185487
25.6%

level3Name
Text

Missing 

Distinct1550
Distinct (%)2.5%
Missing539390
Missing (%)89.7%
Memory size4.6 MiB

Length

Max length32
Median length27
Mean length9.01318058
Min length2

Characters and Unicode

Total characters559367
Distinct characters107
Distinct categories8
Distinct scripts2
Distinct blocks3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique451
Unique (%)0.7%

Sample

1st rown.a. ( 4)
2nd rowKai !Garib
3rd rowKargil
4th rowYamba
5th rowMalaking Patag
ValueCountFrequency (%)
ward 1627
 
1.9%
n.a 1294
 
1.5%
1255
 
1.5%
lower 1037
 
1.2%
bambara 1037
 
1.2%
seka 993
 
1.1%
chekorsa 988
 
1.1%
na 839
 
1.0%
arraiján 727
 
0.8%
tambakha 717
 
0.8%
Other values (1794) 75841
87.8%

Most occurring characters

ValueCountFrequency (%)
a 81980
 
14.7%
i 36730
 
6.6%
n 36575
 
6.5%
o 35678
 
6.4%
e 34413
 
6.2%
r 27496
 
4.9%
u 26462
 
4.7%
(space) 24294
 
4.3%
g 17364
 
3.1%
l 17126
 
3.1%
Other values (97) 221249
39.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 434638
77.7%
Uppercase Letter 83597
 
14.9%
Space Separator 24294
 
4.3%
Other Punctuation 5254
 
0.9%
Decimal Number 4700
 
0.8%
Open Punctuation 2638
 
0.5%
Close Punctuation 2632
 
0.5%
Dash Punctuation 1614
 
0.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 81980
18.9%
i 36730
 
8.5%
n 36575
 
8.4%
o 35678
 
8.2%
e 34413
 
7.9%
r 27496
 
6.3%
u 26462
 
6.1%
g 17364
 
4.0%
l 17126
 
3.9%
m 14811
 
3.4%
Other values (51) 106003
24.4%
Uppercase Letter
ValueCountFrequency (%)
K 7528
 
9.0%
S 7526
 
9.0%
T 6850
 
8.2%
B 6823
 
8.2%
M 6606
 
7.9%
A 5229
 
6.3%
C 5171
 
6.2%
G 5052
 
6.0%
N 4059
 
4.9%
L 3952
 
4.7%
Other values (17) 24801
29.7%
Decimal Number
ValueCountFrequency (%)
1 1266
26.9%
2 536
11.4%
7 498
 
10.6%
9 455
 
9.7%
6 384
 
8.2%
3 380
 
8.1%
0 346
 
7.4%
4 333
 
7.1%
8 312
 
6.6%
5 190
 
4.0%
Other Punctuation
ValueCountFrequency (%)
. 3068
58.4%
/ 1029
 
19.6%
! 638
 
12.1%
' 317
 
6.0%
, 202
 
3.8%
Space Separator
ValueCountFrequency (%)
(space) 24294
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2638
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2632
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1614
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 518235
92.6%
Common 41132
 
7.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 81980
15.8%
i 36730
 
7.1%
n 36575
 
7.1%
o 35678
 
6.9%
e 34413
 
6.6%
r 27496
 
5.3%
u 26462
 
5.1%
g 17364
 
3.4%
l 17126
 
3.3%
m 14811
 
2.9%
Other values (78) 189600
36.6%
Common
ValueCountFrequency (%)
(space) 24294
59.1%
. 3068
 
7.5%
( 2638
 
6.4%
) 2632
 
6.4%
- 1614
 
3.9%
1 1266
 
3.1%
/ 1029
 
2.5%
! 638
 
1.6%
2 536
 
1.3%
7 498
 
1.2%
Other values (9) 2919
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 554144
99.1%
None 4929
 
0.9%
Latin Ext Additional 294
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 81980
 
14.8%
i 36730
 
6.6%
n 36575
 
6.6%
o 35678
 
6.4%
e 34413
 
6.2%
r 27496
 
5.0%
u 26462
 
4.8%
(space) 24294
 
4.4%
g 17364
 
3.1%
l 17126
 
3.1%
Other values (61) 216026
39.0%
None
ValueCountFrequency (%)
é 2169
44.0%
á 961
19.5%
è 846
 
17.2%
ó 452
 
9.2%
ñ 162
 
3.3%
â 63
 
1.3%
ơ 63
 
1.3%
ú 48
 
1.0%
ư 29
 
0.6%
Đ 22
 
0.4%
Other values (13) 114
 
2.3%
Latin Ext Additional
ValueCountFrequency (%)
78
26.5%
46
15.6%
35
11.9%
33
11.2%
ế 21
 
7.1%
19
 
6.5%
12
 
4.1%
11
 
3.7%
10
 
3.4%
9
 
3.1%
Other values (3) 20
 
6.8%

iucnRedListCategory
Text

Missing 

Distinct9
Distinct (%)< 0.1%
Missing210302
Missing (%)35.0%
Memory size4.6 MiB

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters782298
Distinct characters11
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowLC
2nd rowLC
3rd rowLC
4th rowLC
5th rowLC
ValueCountFrequency (%)
lc 316013
80.8%
ne 32299
 
8.3%
vu 20062
 
5.1%
nt 8397
 
2.1%
en 8040
 
2.1%
dd 3578
 
0.9%
cr 2355
 
0.6%
ex 375
 
0.1%
ew 30
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
C 318368
40.7%
L 316013
40.4%
N 48736
 
6.2%
E 40744
 
5.2%
V 20062
 
2.6%
U 20062
 
2.6%
T 8397
 
1.1%
D 7156
 
0.9%
R 2355
 
0.3%
X 375
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 782298
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 318368
40.7%
L 316013
40.4%
N 48736
 
6.2%
E 40744
 
5.2%
V 20062
 
2.6%
U 20062
 
2.6%
T 8397
 
1.1%
D 7156
 
0.9%
R 2355
 
0.3%
X 375
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 782298
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 318368
40.7%
L 316013
40.4%
N 48736
 
6.2%
E 40744
 
5.2%
V 20062
 
2.6%
U 20062
 
2.6%
T 8397
 
1.1%
D 7156
 
0.9%
R 2355
 
0.3%
X 375
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 782298
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 318368
40.7%
L 316013
40.4%
N 48736
 
6.2%
E 40744
 
5.2%
V 20062
 
2.6%
U 20062
 
2.6%
T 8397
 
1.1%
D 7156
 
0.9%
R 2355
 
0.3%
X 375
 
< 0.1%